@ -762,369 +762,391 @@ a[3] = c;</programlisting>
endian-sensitive built-in functions can be found in <xref
linkend="VIPR.biendian.sensitive" />.
</para>
<section>
<title>Extended Data Movement Functions</title>
<para>
The built-in functions in <xref
linkend="VIPR.biendian.vmx-mem" /> map to AltiVec/VMX load and
store instructions and provide access to the “auto-aligning”
memory instructions of the VMX ISA, which discard the low-order
address bits before performing a memory access. These
instructions load and store data in accordance with the
program's current endian mode, so the compiler does not need
to adapt them for little-endian operation during code
generation.
</para>
<para>
Before the bi-endian programming model was introduced, the
<code>vec_lvsl</code> and <code>vec_lvsr</code> intrinsics
were supported. These could be used in conjunction with
<code>vec_perm</code> and VMX load and store instructions for
unaligned access. The <code>vec_lvsl</code> and
<code>vec_lvsr</code> interfaces are deprecated in favor of
the interfaces specified here. For compatibility, the
built-in pseudo sequences published in previous VMX documents
continue to work with little-endian data layout and the
little-endian vector layout described in this document.
However, the use of these sequences in new code is discouraged
and usually results in worse performance. It is recommended
that compilers issue a warning when these functions are used
in little-endian environments.
</para>
<table frame="all" pgwide="1" xml:id="VIPR.biendian.vmx-mem">
<title>VMX Memory Access Built-In Functions</title>
<table frame="all" pgwide="1" xml:id="VIPR.biendian.sensitive">
<title>Endian-Sensitive Built-In Functions</title>
<tgroup cols="3">
<colspec colname="c1" colwidth="15*" align="center" />
<colspec colname="c2" colwidth="35*" align="center" />
<colspec colname="c3" colwidth="50*" />
<thead>
<colspec colname="c2" colwidth="15*" align="center" />
<colspec colname="c3" colwidth="15*" align="center" />
<tbody>
<row>
<entry>
<para>
<emphasis role="bold">Built-in Function</emphasis>
</para>
<para>vec_bperm</para>
</entry>
<entry>
<para>
<emphasis role="bold">Corresponding Power
Instructions</emphasis>
</para>
<para>vec_mergeh</para>
</entry>
<entry align="center">
<para>
<emphasis role="bold">Implementation Notes</emphasis>
</para>
<entry>
<para>vec_signedo</para>
</entry>
</row>
</thead>
<tbody>
<row>
<entry>
<para>vec_ld</para>
<para>vec_cipher_be</para>
</entry>
<entry>
<para>lvx</para>
<para>vec_mergel</para>
</entry>
<entry>
<para>Hardware works as a function of endian mode.</para>
<para>vec_sld</para>
</entry>
</row>
<row>
<entry>
<para>vec_lde</para>
<para>vec_cipherlast_be</para>
</entry>
<entry>
<para>lvebx, lvehx, lvewx</para>
<para>vec_mergeo</para>
</entry>
<entry>
<para>Hardware works as a function of endian mode.</para>
<para>vec_sldw</para>
</entry>
</row>
<row>
<entry>
<para>vec_ldl</para>
<para>vec_doublee</para>
</entry>
<entry>
<para>lvxl</para>
<para>vec_mfvscr</para>
</entry>
<entry>
<para>Hardware works as a function of endian mode.</para>
<para>vec_sll</para>
</entry>
</row>
<row>
<entry>
<para>vec_st</para>
<para>vec_doubleh</para>
</entry>
<entry>
<para>stvx</para>
<para>vec_mule</para>
</entry>
<entry>
<para>Hardware works as a function of endian mode.</para>
<para>vec_slo</para>
</entry>
</row>
<row>
<entry>
<para>vec_ste</para>
<para>vec_doublel</para>
</entry>
<entry>
<para>stvebx, stvehx, stvewx</para>
<para>vec_mulo</para>
</entry>
<entry>
<para>Hardware works as a function of endian mode.</para>
<para>vec_slv</para>
</entry>
</row>
<row>
<entry>
<para>vec_stl</para>
<para>vec_doubleo</para>
</entry>
<entry>
<para>stvxl</para>
<para>vec_ncipher_be</para>
</entry>
<entry>
<para>Hardware works as a function of endian mode.</para>
<para>vec_splat</para>
</entry>
</row>
</tbody>
</tgroup>
</table>
<para>
Instead of these deprecated sequences, programmers should use the
<code>vec_xl</code> and <code>vec_xst</code> vector built-in
functions to access unaligned data streams. See the
descriptions of these functions in <xref
linkend="VIPR.vec-ref" /> for further information and
implementation details.
</para>
<table frame="all" pgwide="1" xml:id="VIPR.biendian.sensitive">
<title>Endian-Sensitive Built-In Functions</title>
<tgroup cols="3">
<colspec colname="c1" colwidth="15*" align="center" />
<colspec colname="c2" colwidth="15*" align="center" />
<colspec colname="c3" colwidth="15*" align="center" />
<tbody>
<row>
<entry>
<para>vec_bperm</para>
<para>vec_extract</para>
</entry>
<entry>
<para>vec_mergeo</para>
<para>vec_ncipherlast_be</para>
</entry>
<entry>
<para>vec_sld</para>
<para>vec_srl</para>
</entry>
</row>
<row>
<entry>
<para>vec_cipher_be</para>
<para>vec_extract_fp32_from_shorth</para>
</entry>
<entry>
<para>vec_mfvscr</para>
<para>vec_pack</para>
</entry>
<entry>
<para>vec_sldw</para>
<para>vec_sro</para>
</entry>
</row>
<row>
<entry>
<para>vec_cipherlast_be</para>
<para>vec_extract_fp32_from_shortl</para>
</entry>
<entry>
<para>vec_mule</para>
<para>vec_pack_to_short_fp32</para>
</entry>
<entry>
<para>vec_sll</para>
<para>vec_srv</para>
</entry>
</row>
<row>
<entry>
<para>vec_doublee</para>
<para>vec_extract4b</para>
</entry>
<entry>
<para>vec_mulo</para>
<para>vec_packpx</para>
</entry>
<entry>
<para>vec_slo</para>
<para>vec_sum2s</para>
</entry>
</row>
<row>
<entry>
<para>vec_doubleh</para>
<para>vec_first_match_index</para>
</entry>
<entry>
<para>vec_ncipher_be</para>
<para>vec_packs</para>
</entry>
<entry>
<para>vec_slv</para>
<para>vec_sums</para>
</entry>
</row>
<row>
<entry>
<para>vec_doublel</para>
<para>vec_first_match_or_eos_index</para>
</entry>
<entry>
<para>vec_ncipherlast_be</para>
<para>vec_packsu</para>
</entry>
<entry>
<para>vec_splat</para>
<para>vec_unpackh</para>
</entry>
</row>
<row>
<entry>
<para>vec_doubleo</para>
<para>vec_first_mismatch_index</para>
</entry>
<entry>
<para>vec_pack</para>
<para>vec_perm</para>
</entry>
<entry>
<para>vec_srl</para>
<para>vec_unpackl</para>
</entry>
</row>
<row>
<entry>
<para>vec_extract</para>
<para>vec_first_mismatch_or_eos_index</para>
</entry>
<entry>
<para>vec_pack_to_short_fp32</para>
<para>vec_permxor</para>
</entry>
<entry>
<para>vec_sro</para>
<para>vec_unsigned2</para>
</entry>
</row>
<row>
<entry>
<para>vec_extract_fp32_from_shorth</para>
<para>vec_float2</para>
</entry>
<entry>
<para>vec_packpx</para>
<para>vec_pmsum_be</para>
</entry>
<entry>
<para>vec_srv</para>
<para>vec_unsignede</para>
</entry>
</row>
<row>
<entry>
<para>vec_extract_fp32_from_shortl</para>
<para>vec_floate</para>
</entry>
<entry>
<para>vec_packs</para>
<para>vec_reve</para>
</entry>
<entry>
<para>vec_sum2s</para>
<para>vec_unsignedo</para>
</entry>
</row>
<row>
<entry>
<para>vec_extract_4b</para>
<para>vec_floato</para>
</entry>
<entry>
<para>vec_packsu</para>
<para>vec_sbox_be</para>
</entry>
<entry>
<para>vec_sums</para>
<para>vec_xl (ISA 2.07 only)</para>
</entry>
</row>
<row>
<entry>
<para>vec_float2</para>
<para>vec_gb</para>
</entry>
<entry>
<para>vec_perm</para>
<para>vec_shasigma_be</para>
</entry>
<entry>
<para>vec_unpackh</para>
<para>vec_xl_be</para>
</entry>
</row>
<row>
<entry>
<para>vec_floate</para>
<para>vec_insert</para>
</entry>
<entry>
<para>vec_permxor</para>
<para>vec_signed2</para>
</entry>
<entry>
<para>vec_unpackl</para>
<para>vec_xst (ISA 2.07 only)</para>
</entry>
</row>
<row>
<entry>
<para>vec_floato</para>
<para>vec_insert4b</para>
</entry>
<entry>
<para>vec_pmsum_be</para>
<para>vec_signede</para>
</entry>
<entry>
<para>vec_unsigned2</para>
<para>vec_xst_be</para>
</entry>
</row>
<row>
<entry>
<para>vec_gb</para>
<para>vec_mergee</para>
</entry>
<entry>
<para>vec_reve</para>
<para></para>
</entry>
<entry>
<para>vec_unsignede</para>
<para></para>
</entry>
</row>
</tbody>
</tgroup>
</table>
<section>
<title>Extended Data Movement Functions</title>
<para>
The built-in functions in <xref
linkend="VIPR.biendian.vmx-mem" /> map to AltiVec/VMX load and
store instructions and provide access to the “auto-aligning”
memory instructions of the VMX ISA, which discard the low-order
address bits before performing a memory access. These
instructions load and store data in accordance with the
program's current endian mode, so the compiler does not need
to adapt them for little-endian operation during code
generation.
</para>
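<para>
The “auto-aligning” behavior described above can be modeled in portable
C. The following is a minimal sketch, not the hardware implementation:
the function names are hypothetical, and the model assumes only that a
16-byte VMX access such as <code>lvx</code> ignores the low-order four
bits of the address it is given.
</para>

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of any API): models the address
   truncation applied by 16-byte VMX accesses such as lvx and stvx,
   which ignore the low-order four bits of the address. */
uintptr_t vmx_truncate_address(uintptr_t addr)
{
    return addr & ~(uintptr_t)0xF;
}

/* Hypothetical helper: models an auto-aligning 16-byte load by copying
   from the truncated address rather than from the address passed in.
   The pointer/integer round trip is implementation-defined in C but
   behaves as expected on mainstream flat-address platforms. */
void model_auto_aligning_load(uint8_t dst[16], const uint8_t *addr)
{
    const uint8_t *aligned =
        (const uint8_t *)vmx_truncate_address((uintptr_t)addr);
    memcpy(dst, aligned, 16);
}

/* Loads from a misaligned address inside a 16-byte-aligned buffer and
   returns the first byte loaded, which equals the buffer index that
   the truncated address points at. */
int demo_auto_aligning_load(void)
{
    _Alignas(16) uint8_t buffer[32];
    uint8_t loaded[16];

    for (int i = 0; i < 32; i++)
        buffer[i] = (uint8_t)i;

    /* buffer + 21 is not 16-byte aligned; the model reads buffer + 16. */
    model_auto_aligning_load(loaded, buffer + 21);
    assert(loaded[15] == 31);
    return loaded[0];
}
```

<para>
In this model, a load requested at byte offset 21 of an aligned buffer
returns the 16 bytes starting at offset 16, which is why code that needs
the bytes at the requested address must use a different access method.
</para>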
<para>
Before the bi-endian programming model was introduced, the
<code>vec_lvsl</code> and <code>vec_lvsr</code> intrinsics
were supported. These could be used in conjunction with
<code>vec_perm</code> and VMX load and store instructions for
unaligned access. The <code>vec_lvsl</code> and
<code>vec_lvsr</code> interfaces are deprecated in favor of
the interfaces specified here. For compatibility, the
built-in pseudo sequences published in previous VMX documents
continue to work with little-endian data layout and the
little-endian vector layout described in this document.
However, the use of these sequences in new code is discouraged
and usually results in worse performance. It is recommended
that compilers issue a warning when these functions are used
in little-endian environments.
</para>
<table frame="all" pgwide="1" xml:id="VIPR.biendian.vmx-mem">
<title>VMX Memory Access Built-In Functions</title>
<tgroup cols="3">
<colspec colname="c1" colwidth="15*" align="center" />
<colspec colname="c2" colwidth="35*" align="center" />
<colspec colname="c3" colwidth="50*" />
<thead>
<row>
<entry>
<para>
<emphasis role="bold">Built-in Function</emphasis>
</para>
</entry>
<entry>
<para>
<emphasis role="bold">Corresponding Power
Instructions</emphasis>
</para>
</entry>
<entry align="center">
<para>
<emphasis role="bold">Implementation Notes</emphasis>
</para>
</entry>
</row>
</thead>
<tbody>
<row>
<entry>
<para>vec_insert</para>
<para>vec_ld</para>
</entry>
<entry>
<para>vec_sbox_be</para>
<para>lvx</para>
</entry>
<entry>
<para>vec_unsignedo</para>
<para>Hardware works as a function of endian mode.</para>
</entry>
</row>
<row>
<entry>
<para>vec_insert_4b</para>
<para>vec_lde</para>
</entry>
<entry>
<para>vec_shasigma_be</para>
<para>lvebx, lvehx, lvewx</para>
</entry>
<entry>
<para>vec_xl (ISA 2.07 only)</para>
<para>Hardware works as a function of endian mode.</para>
</entry>
</row>
<row>
<entry>
<para>vec_mergee</para>
<para>vec_ldl</para>
</entry>
<entry>
<para>vec_signed2</para>
<para>lvxl</para>
</entry>
<entry>
<para>vec_xl_be</para>
<para>Hardware works as a function of endian mode.</para>
</entry>
</row>
<row>
<entry>
<para>vec_mergeh</para>
<para>vec_st</para>
</entry>
<entry>
<para>vec_signede</para>
<para>stvx</para>
</entry>
<entry>
<para>vec_xst (ISA 2.07 only)</para>
<para>Hardware works as a function of endian mode.</para>
</entry>
</row>
<row>
<entry>
<para>vec_mergel</para>
<para>vec_ste</para>
</entry>
<entry>
<para>vec_signedo</para>
<para>stvebx, stvehx, stvewx</para>
</entry>
<entry>
<para>vec_xst_be</para>
<para>Hardware works as a function of endian mode.</para>
</entry>
</row>
<row>
<entry>
<para>vec_stl</para>
</entry>
<entry>
<para>stvxl</para>
</entry>
<entry>
<para>Hardware works as a function of endian mode.</para>
</entry>
</row>
</tbody>
</tgroup>
</table>
<para>
Instead of these deprecated sequences, programmers should use the
<code>vec_xl</code> and <code>vec_xst</code> vector built-in
functions to access unaligned data streams. See the
descriptions of these functions in <xref
linkend="VIPR.vec-ref" /> for further information and
implementation details.
</para>
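<para>
As an illustration of the recommendation above, the following sketch
copies 16 bytes between addresses of arbitrary alignment using
<code>vec_xl</code> and <code>vec_xst</code>. The helper names are
hypothetical. The sketch assumes a compiler that provides these
built-in functions through <code>altivec.h</code> when
<code>__VSX__</code> is defined, and it substitutes
<code>memcpy</code> on other targets only so that it still builds
there.
</para>

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#if defined(__VSX__)
#include <altivec.h>
#endif

/* Hypothetical helper: copies 16 bytes between addresses that need not
   be 16-byte aligned. */
void copy16_unaligned(uint8_t *dst, const uint8_t *src)
{
#if defined(__VSX__)
    /* vec_xl and vec_xst access the address as given; no low-order
       address bits are discarded. */
    __vector unsigned char v = vec_xl(0, src);
    vec_xst(v, 0, dst);
#else
    /* Stand-in for targets without VSX (an assumption of this sketch,
       not part of the vector programming interface). */
    memcpy(dst, src, 16);
#endif
}

/* Copies src[5..20] to dst[3..18] and returns 1 if exactly those
   16 bytes were written, 0 otherwise. */
int demo_copy16_unaligned(void)
{
    uint8_t src[48];
    uint8_t dst[48];

    for (int i = 0; i < 48; i++) {
        src[i] = (uint8_t)(i + 1);
        dst[i] = 0;
    }

    copy16_unaligned(dst + 3, src + 5);

    /* The neighbors dst[2] and dst[19] must remain untouched. */
    return memcmp(dst + 3, src + 5, 16) == 0
        && dst[2] == 0 && dst[19] == 0;
}
```

<para>
Unlike the <code>vec_lvsl</code>/<code>vec_perm</code> sequences, this
form needs no permute control vector and no second load to handle an
access that straddles a 16-byte boundary.
</para>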
</section>
<section xml:id="VIPR.biendian.BELE">
<title>Big-Endian Vector Layout in Little-Endian Environments