Incorporate changes following Paul Clarke's admirable review

Signed-off-by: Bill Schmidt <wschmidt@linux.ibm.com>
master
Bill Schmidt 3 years ago
parent ae4cd5ccc6
commit 321ac9e713

@ -804,7 +804,7 @@ a[3] = c;</programlisting>
</entry> </entry>
<entry> <entry>
<para revisionflag="added"> <para revisionflag="added">
<code><xref linkend="vec_signextll" <code><xref linkend="vec_signextq"
xrefstyle="select:title nopage"/></code> xrefstyle="select:title nopage"/></code>
</para> </para>
</entry> </entry>
@ -817,10 +817,7 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_mergee" xrefstyle="select:title nopage"/></code></para> <para><code><xref linkend="vec_mergee" xrefstyle="select:title nopage"/></code></para>
</entry> </entry>
<entry> <entry>
<para revisionflag="added"> <para><code><xref linkend="vec_sld" xrefstyle="select:title nopage"/></code></para>
<code><xref linkend="vec_signextq"
xrefstyle="select:title nopage"/></code>
</para>
</entry> </entry>
</row> </row>
<row> <row>
@ -831,7 +828,7 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_mergeh" xrefstyle="select:title nopage"/></code></para> <para><code><xref linkend="vec_mergeh" xrefstyle="select:title nopage"/></code></para>
</entry> </entry>
<entry> <entry>
<para><code><xref linkend="vec_sld" xrefstyle="select:title nopage"/></code></para> <para><code><xref linkend="vec_sldw" xrefstyle="select:title nopage"/></code></para>
</entry> </entry>
</row> </row>
<row> <row>
@ -845,7 +842,7 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_mergel" xrefstyle="select:title nopage"/></code></para> <para><code><xref linkend="vec_mergel" xrefstyle="select:title nopage"/></code></para>
</entry> </entry>
<entry> <entry>
<para><code><xref linkend="vec_sldw" xrefstyle="select:title nopage"/></code></para> <para><code><xref linkend="vec_sll" xrefstyle="select:title nopage"/></code></para>
</entry> </entry>
</row> </row>
<row> <row>
@ -859,7 +856,7 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_mergeo" xrefstyle="select:title nopage"/></code></para> <para><code><xref linkend="vec_mergeo" xrefstyle="select:title nopage"/></code></para>
</entry> </entry>
<entry> <entry>
<para><code><xref linkend="vec_sll" xrefstyle="select:title nopage"/></code></para> <para><code><xref linkend="vec_slo" xrefstyle="select:title nopage"/></code></para>
</entry> </entry>
</row> </row>
<row> <row>
@ -870,7 +867,7 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_mfvscr" xrefstyle="select:title nopage"/></code></para> <para><code><xref linkend="vec_mfvscr" xrefstyle="select:title nopage"/></code></para>
</entry> </entry>
<entry> <entry>
<para><code><xref linkend="vec_slo" xrefstyle="select:title nopage"/></code></para> <para><code><xref linkend="vec_slv" xrefstyle="select:title nopage"/></code></para>
</entry> </entry>
</row> </row>
<row> <row>
@ -881,7 +878,7 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_mule" xrefstyle="select:title nopage"/></code></para> <para><code><xref linkend="vec_mule" xrefstyle="select:title nopage"/></code></para>
</entry> </entry>
<entry> <entry>
<para><code><xref linkend="vec_slv" xrefstyle="select:title nopage"/></code></para> <para><code><xref linkend="vec_splat" xrefstyle="select:title nopage"/></code></para>
</entry> </entry>
</row> </row>
<row> <row>
@ -892,7 +889,7 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_mulo" xrefstyle="select:title nopage"/></code></para> <para><code><xref linkend="vec_mulo" xrefstyle="select:title nopage"/></code></para>
</entry> </entry>
<entry> <entry>
<para><code><xref linkend="vec_splat" xrefstyle="select:title nopage"/></code></para> <para><code><xref linkend="vec_srl" xrefstyle="select:title nopage"/></code></para>
</entry> </entry>
</row> </row>
<row> <row>
@ -903,7 +900,7 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_ncipher_be" xrefstyle="select:title nopage"/></code></para> <para><code><xref linkend="vec_ncipher_be" xrefstyle="select:title nopage"/></code></para>
</entry> </entry>
<entry> <entry>
<para><code><xref linkend="vec_srl" xrefstyle="select:title nopage"/></code></para> <para><code><xref linkend="vec_sro" xrefstyle="select:title nopage"/></code></para>
</entry> </entry>
</row> </row>
<row> <row>
@ -914,7 +911,7 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_ncipherlast_be" xrefstyle="select:title nopage"/></code></para> <para><code><xref linkend="vec_ncipherlast_be" xrefstyle="select:title nopage"/></code></para>
</entry> </entry>
<entry> <entry>
<para><code><xref linkend="vec_sro" xrefstyle="select:title nopage"/></code></para> <para><code><xref linkend="vec_srv" xrefstyle="select:title nopage"/></code></para>
</entry> </entry>
</row> </row>
<row> <row>
@ -925,7 +922,10 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_pack" xrefstyle="select:title nopage"/></code></para> <para><code><xref linkend="vec_pack" xrefstyle="select:title nopage"/></code></para>
</entry> </entry>
<entry> <entry>
<para><code><xref linkend="vec_srv" xrefstyle="select:title nopage"/></code></para> <para revisionflag="added">
<code><xref linkend="vec_stril"
xrefstyle="select:title nopage"/></code>
</para>
</entry> </entry>
</row> </row>
<row> <row>
@ -937,7 +937,7 @@ a[3] = c;</programlisting>
</entry> </entry>
<entry> <entry>
<para revisionflag="added"> <para revisionflag="added">
<code><xref linkend="vec_stril" <code><xref linkend="vec_stril_p"
xrefstyle="select:title nopage"/></code> xrefstyle="select:title nopage"/></code>
</para> </para>
</entry> </entry>
@ -964,7 +964,10 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_packs" xrefstyle="select:title nopage"/></code></para> <para><code><xref linkend="vec_packs" xrefstyle="select:title nopage"/></code></para>
</entry> </entry>
<entry> <entry>
<para><code><xref linkend="vec_sum2s" xrefstyle="select:title nopage"/></code></para> <para revisionflag="added">
<code><xref linkend="vec_strir_p"
xrefstyle="select:title nopage"/></code>
</para>
</entry> </entry>
</row> </row>
<row> <row>
@ -975,7 +978,7 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_packsu" xrefstyle="select:title nopage"/></code></para> <para><code><xref linkend="vec_packsu" xrefstyle="select:title nopage"/></code></para>
</entry> </entry>
<entry> <entry>
<para><code><xref linkend="vec_sums" xrefstyle="select:title nopage"/></code></para> <para><code><xref linkend="vec_sum2s" xrefstyle="select:title nopage"/></code></para>
</entry> </entry>
</row> </row>
<row> <row>
@ -986,7 +989,7 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_perm" xrefstyle="select:title nopage"/></code></para> <para><code><xref linkend="vec_perm" xrefstyle="select:title nopage"/></code></para>
</entry> </entry>
<entry> <entry>
<para><code><xref linkend="vec_unpackh" xrefstyle="select:title nopage"/></code></para> <para><code><xref linkend="vec_sums" xrefstyle="select:title nopage"/></code></para>
</entry> </entry>
</row> </row>
<row> <row>
@ -997,7 +1000,8 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_permxor" xrefstyle="select:title nopage"/></code></para> <para><code><xref linkend="vec_permxor" xrefstyle="select:title nopage"/></code></para>
</entry> </entry>
<entry> <entry>
<para><code><xref linkend="vec_unpackl" xrefstyle="select:title nopage"/></code></para> <para><code><xref linkend="vec_unpackh"
xrefstyle="select:title nopage"/></code></para>
</entry> </entry>
</row> </row>
<row> <row>
@ -1008,7 +1012,7 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_pmsum_be" xrefstyle="select:title nopage"/></code></para> <para><code><xref linkend="vec_pmsum_be" xrefstyle="select:title nopage"/></code></para>
</entry> </entry>
<entry> <entry>
<para><code><xref linkend="vec_unsigned2" xrefstyle="select:title nopage"/></code></para> <para><code><xref linkend="vec_unpackl" xrefstyle="select:title nopage"/></code></para>
</entry> </entry>
</row> </row>
<row> <row>
@ -1019,7 +1023,7 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_reve" xrefstyle="select:title nopage"/></code></para> <para><code><xref linkend="vec_reve" xrefstyle="select:title nopage"/></code></para>
</entry> </entry>
<entry> <entry>
<para><code><xref linkend="vec_unsignede" xrefstyle="select:title nopage"/></code></para> <para><code><xref linkend="vec_unsigned2" xrefstyle="select:title nopage"/></code></para>
</entry> </entry>
</row> </row>
<row> <row>
@ -1030,7 +1034,7 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_sbox_be" xrefstyle="select:title nopage"/></code></para> <para><code><xref linkend="vec_sbox_be" xrefstyle="select:title nopage"/></code></para>
</entry> </entry>
<entry> <entry>
<para><code><xref linkend="vec_unsignedo" xrefstyle="select:title nopage"/></code></para> <para><code><xref linkend="vec_unsignede" xrefstyle="select:title nopage"/></code></para>
</entry> </entry>
</row> </row>
<row> <row>
@ -1044,7 +1048,7 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_shasigma_be" xrefstyle="select:title nopage"/></code></para> <para><code><xref linkend="vec_shasigma_be" xrefstyle="select:title nopage"/></code></para>
</entry> </entry>
<entry> <entry>
<para><code><xref linkend="vec_xl" xrefstyle="select:title nopage"/></code> (ISA 2.07 only)</para> <para><code><xref linkend="vec_unsignedo" xrefstyle="select:title nopage"/></code></para>
</entry> </entry>
</row> </row>
<row> <row>
@ -1058,7 +1062,7 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_signed2" xrefstyle="select:title nopage"/></code></para> <para><code><xref linkend="vec_signed2" xrefstyle="select:title nopage"/></code></para>
</entry> </entry>
<entry> <entry>
<para><code><xref linkend="vec_xl_be" xrefstyle="select:title nopage"/></code></para> <para><code><xref linkend="vec_xl" xrefstyle="select:title nopage"/></code> (ISA 2.07 only)</para>
</entry> </entry>
</row> </row>
<row> <row>
@ -1072,13 +1076,13 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_signede" xrefstyle="select:title nopage"/></code></para> <para><code><xref linkend="vec_signede" xrefstyle="select:title nopage"/></code></para>
</entry> </entry>
<entry> <entry>
<para><code><xref linkend="vec_xst" xrefstyle="select:title nopage"/></code> (ISA 2.07 only)</para> <para><code><xref linkend="vec_xl_be" xrefstyle="select:title nopage"/></code></para>
</entry> </entry>
</row> </row>
<row> <row>
<entry> <entry>
<para revisionflag="added"> <para revisionflag="added">
<code><xref linkend="vec_genwm" <code><xref linkend="vec_genpcvm"
xrefstyle="select:title nopage"/></code> xrefstyle="select:title nopage"/></code>
</para> </para>
</entry> </entry>
@ -1086,12 +1090,15 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_signedo" xrefstyle="select:title nopage"/></code></para> <para><code><xref linkend="vec_signedo" xrefstyle="select:title nopage"/></code></para>
</entry> </entry>
<entry> <entry>
<para><code><xref linkend="vec_xst_be" xrefstyle="select:title nopage"/></code></para> <para><code><xref linkend="vec_xst" xrefstyle="select:title nopage"/></code> (ISA 2.07 only)</para>
</entry> </entry>
</row> </row>
<row> <row>
<entry> <entry>
<para><code><xref linkend="vec_insert" xrefstyle="select:title nopage"/></code></para> <para revisionflag="added">
<code><xref linkend="vec_genwm"
xrefstyle="select:title nopage"/></code>
</para>
</entry> </entry>
<entry> <entry>
<para revisionflag="added"> <para revisionflag="added">
@ -1099,6 +1106,20 @@ a[3] = c;</programlisting>
xrefstyle="select:title nopage"/></code> xrefstyle="select:title nopage"/></code>
</para> </para>
</entry> </entry>
<entry>
<para><code><xref linkend="vec_xst_be" xrefstyle="select:title nopage"/></code></para>
</entry>
</row>
<row>
<entry>
<para><code><xref linkend="vec_insert" xrefstyle="select:title nopage"/></code></para>
</entry>
<entry>
<para revisionflag="added">
<code><xref linkend="vec_signextll"
xrefstyle="select:title nopage"/></code>
</para>
</entry>
<entry> <entry>
</entry> </entry>
</row> </row>
@ -1255,13 +1276,14 @@ a[3] = c;</programlisting>
introduced serious compiler complexity without much utility. introduced serious compiler complexity without much utility.
Thus this support (previously controlled by switches Thus this support (previously controlled by switches
<code>-maltivec=be</code> and/or <code>-qaltivec=be</code>) is <code>-maltivec=be</code> and/or <code>-qaltivec=be</code>) is
now deprecated. Current versions of the GCC and Clang now deprecated. Current versions of the <phrase
open-source compilers do not implement this support. revisionflag="changed">GCC, Clang, and Open XL</phrase>
compilers do not implement this support.
</para> </para>
</section> </section>
</section> </section>


<section> <section revisionflag="deleted">
<title>Language-Specific Vector Support for Other <title>Language-Specific Vector Support for Other
Languages</title> Languages</title>
<section> <section>

@ -201,11 +201,15 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_intro">
</listitem> </listitem>
<listitem> <listitem>
<para> <para>
<emphasis role="underline">The XL compilers</emphasis>. For <emphasis role="underline">The XL <phrase
XL compilers provided with the Linux Community Edition, you revisionflag="added">and OpenXL</phrase>
can provide feedback to the XL compiler team via email compilers</emphasis>. For XL <phrase
revisionflag="added">and OpenXL</phrase> compilers provided
with the Linux Community Edition, you can provide feedback
to the XL compiler team via email
(<email>compinfo@cn.ibm.com</email>); for other editions of (<email>compinfo@cn.ibm.com</email>); for other editions of
XL compilers, please open a <link XL <phrase revisionflag="added">and OpenXL</phrase>
compilers, please open a <link
xlink:href="https://www.ibm.com/mysupport/s/">Case</link>. xlink:href="https://www.ibm.com/mysupport/s/">Case</link>.
</para> </para>
</listitem> </listitem>
@ -335,6 +339,22 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_intro">
</emphasis> </emphasis>
</para> </para>
</listitem> </listitem>
<listitem revisionflag="added">
<para>
<emphasis>The GNU C Library Project.</emphasis>
<emphasis>
<link xlink:href="https://www.gnu.org/software/libc">https://www.gnu.org/software/libc</link>
</emphasis>
</para>
</listitem>
<listitem revisionflag="added">
<para>
<emphasis>Matrix-Multiply Assist Best Practices Guide.</emphasis>
<emphasis>
<link xlink:href="http://www.redbooks.ibm.com/redpapers/pdfs/redp5612.pdf">https://www.redbooks.ibm.com/redpapers/pdfs/redp5612.pdf</link>
</emphasis>
</para>
</listitem>
</itemizedlist> </itemizedlist>
</section> </section>



@ -19,7 +19,7 @@
revisionflag="added"> revisionflag="added">


<!-- Chapter Title goes here. --> <!-- Chapter Title goes here. -->
<title>Matrix Multiply Accelerate (MMA) Intrinsic Reference</title> <title>Matrix-Multiply Assist (MMA) Intrinsic Reference</title>


<section> <section>
<title>Introduction</title> <title>Introduction</title>
@ -43,8 +43,14 @@
instruction directly writes to one of these VSRs. instruction directly writes to one of these VSRs.
</para> </para>
<para> <para>
<emphasis role="bold">Review status:</emphasis> This chapter is This reference is not intended to be a complete introduction to
not yet reviewed by anyone. MMA concepts. The reader is directed to the Matrix-Multiply
Assist Best Practices Guide (see <xref
linkend="VIPR.intro.links" />) and to the POWER ISA.
</para>
<para>
<emphasis role="bold">Review status:</emphasis> Chapter reviewed
by Paul Clarke; changes made.
</para> </para>
</section> </section>


@ -76,6 +82,14 @@
<para> <para>
Load and store vector pairs. Load and store vector pairs.
</para> </para>
<indexterm>
<primary>lxvp</primary>
<secondary>__builtin_vsx_lxvp</secondary>
</indexterm>
<indexterm>
<primary>stxvp</primary>
<secondary>__builtin_vsx_stxvp</secondary>
</indexterm>
<para> <para>
<informaltable frame="all"> <informaltable frame="all">
<tgroup cols="2"> <tgroup cols="2">
@ -95,7 +109,7 @@
<row> <row>
<entry> <entry>
<programlisting> <programlisting>
__vector pair __builtin_vsx_lxvp (long long int a, const __vector_pair* b) __vector_pair __builtin_vsx_lxvp (long long a, const __vector_pair* b)
</programlisting> </programlisting>
</entry> </entry>
<entry> <entry>
@ -107,7 +121,7 @@
<row> <row>
<entry> <entry>
<programlisting> <programlisting>
void __builtin_vsx_stxvp (__vector_pair s, long long int a, const __vector_pair* b) void __builtin_vsx_stxvp (__vector_pair s, long long a, const __vector_pair* b)
</programlisting> </programlisting>
</entry> </entry>
<entry> <entry>
@ -226,6 +240,18 @@
(a "priming" operation) or vice versa ( a "depriming" (a "priming" operation) or vice versa ( a "depriming"
operation), or initialize an accumulator to zeros. operation), or initialize an accumulator to zeros.
</para> </para>
<indexterm>
<primary>xxmfacc</primary>
<secondary>__builtin_mma_xxmfacc</secondary>
</indexterm>
<indexterm>
<primary>xxmtacc</primary>
<secondary>__builtin_mma_xxmtacc</secondary>
</indexterm>
<indexterm>
<primary>xxsetaccz</primary>
<secondary>__builtin_mma_xxsetaccz</secondary>
</indexterm>
<para> <para>
<informaltable frame="all"> <informaltable frame="all">
<tgroup cols="2"> <tgroup cols="2">
@ -289,6 +315,238 @@
Each of these intrinsics generates an instruction to perform Each of these intrinsics generates an instruction to perform
an outer product operation. an outer product operation.
</para> </para>
<indexterm>
<primary>pmxvbf16ger2</primary>
<secondary>__builtin_mma_pmxvbf16ger2</secondary>
</indexterm>
<indexterm>
<primary>pmxvbf16ger2nn</primary>
<secondary>__builtin_mma_pmxvbf16ger2nn</secondary>
</indexterm>
<indexterm>
<primary>pmxvbf16ger2np</primary>
<secondary>__builtin_mma_pmxvbf16ger2np</secondary>
</indexterm>
<indexterm>
<primary>pmxvbf16ger2pn</primary>
<secondary>__builtin_mma_pmxvbf16ger2pn</secondary>
</indexterm>
<indexterm>
<primary>pmxvbf16ger2pp</primary>
<secondary>__builtin_mma_pmxvbf16ger2pp</secondary>
</indexterm>
<indexterm>
<primary>pmxvf16ger2</primary>
<secondary>__builtin_mma_pmxvf16ger2</secondary>
</indexterm>
<indexterm>
<primary>pmxvf16ger2nn</primary>
<secondary>__builtin_mma_pmxvf16ger2nn</secondary>
</indexterm>
<indexterm>
<primary>pmxvf16ger2np</primary>
<secondary>__builtin_mma_pmxvf16ger2np</secondary>
</indexterm>
<indexterm>
<primary>pmxvf16ger2pn</primary>
<secondary>__builtin_mma_pmxvf16ger2pn</secondary>
</indexterm>
<indexterm>
<primary>pmxvf16ger2pp</primary>
<secondary>__builtin_mma_pmxvf16ger2pp</secondary>
</indexterm>
<indexterm>
<primary>pmxvf32ger</primary>
<secondary>__builtin_mma_pmxvf32ger</secondary>
</indexterm>
<indexterm>
<primary>pmxvf32gernn</primary>
<secondary>__builtin_mma_pmxvf32gernn</secondary>
</indexterm>
<indexterm>
<primary>pmxvf32gernp</primary>
<secondary>__builtin_mma_pmxvf32gernp</secondary>
</indexterm>
<indexterm>
<primary>pmxvf32gerpn</primary>
<secondary>__builtin_mma_pmxvf32gerpn</secondary>
</indexterm>
<indexterm>
<primary>pmxvf32gerpp</primary>
<secondary>__builtin_mma_pmxvf32gerpp</secondary>
</indexterm>
<indexterm>
<primary>pmxvf64ger</primary>
<secondary>__builtin_mma_pmxvf64ger</secondary>
</indexterm>
<indexterm>
<primary>pmxvf64gernn</primary>
<secondary>__builtin_mma_pmxvf64gernn</secondary>
</indexterm>
<indexterm>
<primary>pmxvf64gernp</primary>
<secondary>__builtin_mma_pmxvf64gernp</secondary>
</indexterm>
<indexterm>
<primary>pmxvf64gerpn</primary>
<secondary>__builtin_mma_pmxvf64gerpn</secondary>
</indexterm>
<indexterm>
<primary>pmxvf64gerpp</primary>
<secondary>__builtin_mma_pmxvf64gerpp</secondary>
</indexterm>
<indexterm>
<primary>pmxvi64ger2</primary>
<secondary>__builtin_mma_pmxvi64ger2</secondary>
</indexterm>
<indexterm>
<primary>pmxvi64ger2pp</primary>
<secondary>__builtin_mma_pmxvi64ger2pp</secondary>
</indexterm>
<indexterm>
<primary>pmxvi64ger2s</primary>
<secondary>__builtin_mma_pmxvi64ger2s</secondary>
</indexterm>
<indexterm>
<primary>pmxvi64ger2spp</primary>
<secondary>__builtin_mma_pmxvi64ger2spp</secondary>
</indexterm>
<indexterm>
<primary>pmxvi4ger8</primary>
<secondary>__builtin_mma_pmxvi4ger8</secondary>
</indexterm>
<indexterm>
<primary>pmxvi4ger8pp</primary>
<secondary>__builtin_mma_pmxvi4ger8pp</secondary>
</indexterm>
<indexterm>
<primary>pmxvi8ger4</primary>
<secondary>__builtin_mma_pmxvi8ger4</secondary>
</indexterm>
<indexterm>
<primary>pmxvi8ger4pp</primary>
<secondary>__builtin_mma_pmxvi8ger4pp</secondary>
</indexterm>
<indexterm>
<primary>pmxvi8ger4spp</primary>
<secondary>__builtin_mma_pmxvi8ger4spp</secondary>
</indexterm>
<indexterm>
<primary>xvbf16ger2</primary>
<secondary>__builtin_mma_xvbf16ger2</secondary>
</indexterm>
<indexterm>
<primary>xvbf16ger2nn</primary>
<secondary>__builtin_mma_xvbf16ger2nn</secondary>
</indexterm>
<indexterm>
<primary>xvbf16ger2np</primary>
<secondary>__builtin_mma_xvbf16ger2np</secondary>
</indexterm>
<indexterm>
<primary>xvbf16ger2pn</primary>
<secondary>__builtin_mma_xvbf16ger2pn</secondary>
</indexterm>
<indexterm>
<primary>xvbf16ger2pp</primary>
<secondary>__builtin_mma_xvbf16ger2pp</secondary>
</indexterm>
<indexterm>
<primary>xvf16ger2</primary>
<secondary>__builtin_mma_xvf16ger2</secondary>
</indexterm>
<indexterm>
<primary>xvf16ger2nn</primary>
<secondary>__builtin_mma_xvf16ger2nn</secondary>
</indexterm>
<indexterm>
<primary>xvf16ger2np</primary>
<secondary>__builtin_mma_xvf16ger2np</secondary>
</indexterm>
<indexterm>
<primary>xvf16ger2pn</primary>
<secondary>__builtin_mma_xvf16ger2pn</secondary>
</indexterm>
<indexterm>
<primary>xvf16ger2pp</primary>
<secondary>__builtin_mma_xvf16ger2pp</secondary>
</indexterm>
<indexterm>
<primary>xvf32ger</primary>
<secondary>__builtin_mma_xvf32ger</secondary>
</indexterm>
<indexterm>
<primary>xvf32gernn</primary>
<secondary>__builtin_mma_xvf32gernn</secondary>
</indexterm>
<indexterm>
<primary>xvf32gernp</primary>
<secondary>__builtin_mma_xvf32gernp</secondary>
</indexterm>
<indexterm>
<primary>xvf32gerpn</primary>
<secondary>__builtin_mma_xvf32gerpn</secondary>
</indexterm>
<indexterm>
<primary>xvf32gerpp</primary>
<secondary>__builtin_mma_xvf32gerpp</secondary>
</indexterm>
<indexterm>
<primary>xvf64ger</primary>
<secondary>__builtin_mma_xvf64ger</secondary>
</indexterm>
<indexterm>
<primary>xvf64gernn</primary>
<secondary>__builtin_mma_xvf64gernn</secondary>
</indexterm>
<indexterm>
<primary>xvf64gernp</primary>
<secondary>__builtin_mma_xvf64gernp</secondary>
</indexterm>
<indexterm>
<primary>xvf64gerpn</primary>
<secondary>__builtin_mma_xvf64gerpn</secondary>
</indexterm>
<indexterm>
<primary>xvf64gerpp</primary>
<secondary>__builtin_mma_xvf64gerpp</secondary>
</indexterm>
<indexterm>
<primary>xvi16ger2</primary>
<secondary>__builtin_mma_xvi16ger2</secondary>
</indexterm>
<indexterm>
<primary>xvi16ger2pp</primary>
<secondary>__builtin_mma_xvi16ger2pp</secondary>
</indexterm>
<indexterm>
<primary>xvi16ger2s</primary>
<secondary>__builtin_mma_xvi16ger2s</secondary>
</indexterm>
<indexterm>
<primary>xvi16ger2spp</primary>
<secondary>__builtin_mma_xvi16ger2spp</secondary>
</indexterm>
<indexterm>
<primary>xvi4ger8</primary>
<secondary>__builtin_mma_xvi4ger8</secondary>
</indexterm>
<indexterm>
<primary>xvi4ger8pp</primary>
<secondary>__builtin_mma_xvi4ger8pp</secondary>
</indexterm>
<indexterm>
<primary>xvi8ger4</primary>
<secondary>__builtin_mma_xvi8ger4</secondary>
</indexterm>
<indexterm>
<primary>xvi8ger4pp</primary>
<secondary>__builtin_mma_xvi8ger4pp</secondary>
</indexterm>
<indexterm>
<primary>xvi8ger4spp</primary>
<secondary>__builtin_mma_xvi8ger4spp</secondary>
</indexterm>
<para> <para>
<informaltable frame="all"> <informaltable frame="all">
<tgroup cols="2"> <tgroup cols="2">
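
Reviewer note on the accumulator and outer-product intrinsics indexed in the hunks above: the following is a minimal sketch, not text from the reference. It assumes the GCC/Clang builtins `__builtin_mma_xxsetaccz`, `__builtin_mma_xvf32gerpp`, and `__builtin_mma_disassemble_acc`, the `__MMA__` predefined macro, and a Power10 target (for example `-mcpu=power10`). The helper name `mma_splat_dot` is hypothetical. Both inputs are splatted so that every accumulator element holds the same value, which keeps the sketch independent of element ordering and endianness. A scalar fallback is used when MMA is not available.

```c
#include <stddef.h>

#if defined(__MMA__)
#include <altivec.h>
#endif

/* Hypothetical helper (not part of the reference): computes
   sum(x[k] * y[k]) for k in [0, n).  With MMA enabled, each step is a
   rank-1 update of a 4x4 float accumulator built from splatted inputs,
   so all 16 accumulator elements end up holding the same sum. */
static float mma_splat_dot(const float *x, const float *y, size_t n)
{
#if defined(__MMA__)
    __vector_quad acc;
    float out[4][4] __attribute__((aligned(16)));

    /* Initialize the accumulator to zeros. */
    __builtin_mma_xxsetaccz(&acc);

    for (size_t k = 0; k < n; k++) {
        /* The MMA builtins take their vector operands as vector unsigned char. */
        vector unsigned char vx = (vector unsigned char) vec_splats(x[k]);
        vector unsigned char vy = (vector unsigned char) vec_splats(y[k]);

        /* acc += outer product of vx and vy (positive multiply, positive accumulate). */
        __builtin_mma_xvf32gerpp(&acc, vx, vy);
    }

    /* Copy the accumulator contents out to memory. */
    __builtin_mma_disassemble_acc(out, &acc);
    return out[0][0];
#else
    /* Scalar fallback, assumed here only so the sketch builds without MMA. */
    float sum = 0.0f;
    for (size_t k = 0; k < n; k++)
        sum += x[k] * y[k];
    return sum;
#endif
}
```

A real kernel would load distinct rows and columns rather than splats; the Matrix-Multiply Assist Best Practices Guide cited in this commit covers element layout in detail.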

@ -113,9 +113,10 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_techniques">
references. (<code>restrict</code> can be used only in C references. (<code>restrict</code> can be used only in C
when compiling for the C99 standard or later. when compiling for the C99 standard or later.
<code>__restrict__</code> is a language extension, available <code>__restrict__</code> is a language extension, available
in GCC, Clang, and the XL compilers, that can be used in GCC, Clang, and the XL <phrase revisionflag="added">and
without restriction for both C and C++. See your compiler's Open XL</phrase> compilers, that can be used without
user manual for details.) restriction for both C and C++. See your compiler's user
manual for details.)
</para> </para>
<para> <para>
Suppose you have a function that takes two pointer Suppose you have a function that takes two pointer
@ -142,8 +143,9 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_techniques">
</para> </para>
<para> <para>
This reference provides intrinsics that are guaranteed to be This reference provides intrinsics that are guaranteed to be
portable across compliant compilers. In particular, both the portable across compliant compilers. In particular, the <phrase
GCC and Clang compilers for Power implement the intrinsics in revisionflag="changed">GCC, Clang, and Open XL</phrase>
compilers for Power implement the intrinsics in
this manual. The compilers may each implement many more this manual. The compilers may each implement many more
intrinsics, but the ones in this manual are the only ones intrinsics, but the ones in this manual are the only ones
guaranteed to be portable. So if you are using an interface not guaranteed to be portable. So if you are using an interface not
@ -204,10 +206,15 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_techniques">
responsible for following the calling conventions established by responsible for following the calling conventions established by
the ABI (see <xref linkend="VIPR.intro.links" />). Again, it is the ABI (see <xref linkend="VIPR.intro.links" />). Again, it is
best to look at examples. One place to find well-written best to look at examples. One place to find well-written
<code>.S</code> files is in the GLIBC project. You can also <code>.S</code> files is in the <phrase
revisionflag="changed">GNU C Library project (see <xref
linkend="VIPR.intro.links" />).</phrase> You can also
study the assembly output from your favorite compiler, which can study the assembly output from your favorite compiler, which can
be obtained with the <code>-S</code> or similar option, or by be obtained with the <code>-S</code> or similar option, or by
using the <emphasis role="bold">objdump</emphasis> utility. using the <emphasis role="bold">objdump</emphasis> utility:
</para>
<para revisionflag="added">
<programlisting> objdump -dr &lt;binary or object file&gt;</programlisting>
</para> </para>
</section> </section>


@ -219,7 +226,8 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_techniques">
<section> <section>
<title>x86 Vector Portability Headers</title> <title>x86 Vector Portability Headers</title>
<para> <para>
Recent versions of the GCC and Clang open-source compilers Recent versions of the <phrase revisionflag="changed">GCC,
Clang, and Open XL</phrase> compilers
for Power provide "drop-in" portability headers for portions for Power provide "drop-in" portability headers for portions
of the Intel Architecture Instruction Set Extensions (see <xref of the Intel Architecture Instruction Set Extensions (see <xref
linkend="VIPR.intro.links" />). These headers mirror the APIs linkend="VIPR.intro.links" />). These headers mirror the APIs
@ -243,14 +251,18 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_techniques">
<para> <para>
Access to the portability APIs occurs automatically when Access to the portability APIs occurs automatically when
including one of the corresponding Intel header files, such as including one of the corresponding Intel header files, such as
<code>&lt;mmintrin.h&gt;</code>. <code>&lt;mmintrin.h&gt;</code>. <phrase
revisionflag="added">You must also compile with
<code>-DNO_WARN_X86_INTRINSICS</code> to opt into using the
headers.</phrase>
</para> </para>
</section> </section>
<section xml:id="VIPR.techniques.pveclib"> <section xml:id="VIPR.techniques.pveclib">
<title>The Power Vector Library (pveclib)</title> <title>The Power Vector Library (pveclib)</title>
<para>The Power Vector Library, also known as <para>The Power Vector Library, also known as
<code>pveclib</code>, is a separate project available from <code>pveclib</code>, is a separate project available from
github (see <xref linkend="VIPR.intro.links" />). The <phrase revisionflag="changed">GitHub</phrase> (see <xref
linkend="VIPR.intro.links" />). The
<code>pveclib</code> project builds on top of the intrinsics <code>pveclib</code> project builds on top of the intrinsics
described in this manual to provide higher-level vector described in this manual to provide higher-level vector
interfaces that are highly portable. The goals of the project interfaces that are highly portable. The goals of the project
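
Reviewer note on the `__restrict__` paragraph touched in the first hunk of this file: a minimal sketch of the extension in use, assuming a GCC-compatible compiler. The function name `add_arrays` is hypothetical and does not come from the reference.

```c
#include <stddef.h>

/* Hypothetical helper (not part of the reference): adds b[i] into a[i].
   The __restrict__ qualifiers promise the compiler that the memory
   reached through a and b does not overlap, which removes one obstacle
   to vectorizing the loop.  Calling this with overlapping arrays is
   undefined behavior. */
static void add_arrays(float *__restrict__ a,
                       const float *__restrict__ b,
                       size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] += b[i];
}
```

Because `__restrict__` is a language extension rather than the C99 keyword, the same declaration is also accepted when the file is compiled as C++.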

File diff suppressed because it is too large