<?xml version="1.0" encoding="UTF-8"?>
<!--
Copyright (c) 2017 OpenPOWER Foundation
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="sec_intel_intrinsic_types">
<title>The types used for intrinsics</title>
<para>The type system for Intel intrinsics is somewhat unusual. For example,
from xmmintrin.h:
<programlisting><![CDATA[/* The Intel API is flexible enough that we must allow aliasing with other
vector types, and their scalar components. */
typedef float __m128 __attribute__ ((__vector_size__ (16), __may_alias__));
/* Internal data types for implementing the intrinsics. */
typedef float __v4sf __attribute__ ((__vector_size__ (16)));]]></programlisting></para>
<para>So there is one set of types used in the function prototypes
of the API, and a second set of internal types used in the implementation. Notice
the special attribute <literal>__may_alias__</literal>. From the GCC documentation:
<blockquote><para>
Accesses through pointers to types with this attribute are not subject
to type-based alias analysis, but are instead assumed to be able to alias any
other type of objects. ... This extension exists to support some vector APIs,
in which pointers to one vector type are permitted to alias pointers to a
different vector type.</para></blockquote></para>
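<para>To illustrate why the API needs this attribute, here is a minimal
sketch that is not taken from any header. The names <literal>m128i_like</literal>,
<literal>v8hi_like</literal>, and <literal>sum_shorts</literal> are hypothetical.
It assumes GCC vector extensions and that the caller passes 16-byte aligned
storage. An array of <literal>short</literal> is read through a pointer to a
<literal>long long</literal> vector type, an access that would otherwise violate
the type-based aliasing rules:</para>
<!--
```c
// --><programlisting><![CDATA[// Sketch only; all names here are hypothetical.
/* Interface-style type: 16 bytes of long long that may alias any object. */
typedef long long m128i_like
  __attribute__ ((__vector_size__ (16), __may_alias__));
/* Internal-style type: eight 16-bit lanes, no aliasing exemption. */
typedef short v8hi_like __attribute__ ((__vector_size__ (16)));

/* Sum eight shorts. The array is read through a pointer to m128i_like,
   which is only well defined because of __may_alias__.
   Assumption: p points to 16-byte aligned storage. */
int
sum_shorts (const short *p)
{
  m128i_like raw = *(const m128i_like *) p;  /* load as the interface type */
  v8hi_like h = (v8hi_like) raw;             /* reinterpret as 8 x short */
  int sum = 0;
  for (int i = 0; i < 8; i++)
    sum += h[i];
  return sum;
}
// End of sketch. ]]></programlisting><!--
```
-->
<para>A caller would declare the array with
<literal>__attribute__ ((aligned (16)))</literal> before passing it to
<literal>sum_shorts</literal>.</para>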
<para>There are a couple of issues here:
<itemizedlist spacing="compact">
<listitem>
<para>The API seems to force the compiler to assume
aliasing of any parameter passed by reference.</para>
</listitem>
<listitem>
<para>The data type used at the interface may not be
the correct type for the implied operation.</para>
</listitem>
</itemizedlist>
Normally the compiler assumes that objects of different types do
not overlap in storage, which allows more optimization.
However, parameters with different vector element sizes
(char | short | int | long) are all passed and returned as type <literal>__m128i</literal>
(defined as vector long long).</para>
<para>This may not matter when using x86 built-ins, but it does matter when
the implementation uses C vector extensions or, in our case, PowerPC generic
vector built-ins
(<xref linkend="sec_powerisa_vector_intrinsics"/>).
In the latter cases the operand type must match the element type
(char, short, int, long) for the compiler to generate the correct form of the
generic built-in operation
(<xref linkend="sec_api_implemented"/>). There is also a concern that excessive use of
<literal>__may_alias__</literal>
will limit compiler optimization. We are not sure how important this attribute
is to the correct operation of the API, so at a later stage we should
experiment with removing it from our implementation for PowerPC.</para>
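<para>The casting pattern that this implies can be sketched with GCC's C vector
extensions. The names <literal>m128i_like</literal>, <literal>v8hu_like</literal>,
<literal>v4su_like</literal>, <literal>add_epi16_sketch</literal>, and
<literal>add_epi32_sketch</literal> are hypothetical, and unsigned element types
are an assumption made here so that lane overflow wraps. Both functions take and
return the same interface type, and only the internal cast tells the compiler
which element width to use:</para>
<!--
```c
// --><programlisting><![CDATA[// Sketch only; all names here are hypothetical.
typedef long long m128i_like
  __attribute__ ((__vector_size__ (16), __may_alias__));
typedef unsigned short v8hu_like __attribute__ ((__vector_size__ (16)));
typedef unsigned int v4su_like __attribute__ ((__vector_size__ (16)));

/* Add as eight 16-bit lanes; carries stop at each 16-bit boundary. */
m128i_like
add_epi16_sketch (m128i_like a, m128i_like b)
{
  return (m128i_like) ((v8hu_like) a + (v8hu_like) b);
}

/* Add as four 32-bit lanes; carries stop at each 32-bit boundary. */
m128i_like
add_epi32_sketch (m128i_like a, m128i_like b)
{
  return (m128i_like) ((v4su_like) a + (v4su_like) b);
}
// End of sketch. ]]></programlisting><!--
```
-->
<para>With every 16-bit lane of the first operand set to 0xFFFF and every 16-bit
lane of the second set to 1, the 16-bit version yields zero in every lane while
the 32-bit version yields 0x00010000 in every 32-bit lane. The same input bits
give different results depending only on the internal type.</para>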
<para>The good news is that PowerISA has good support for 128-bit vectors
and (with the addition of VSX) all the required vector data types (char, short, int,
long, float, double). However, Intel supports a wider variety of
vector sizes than PowerISA does. This started with the 64-bit MMX vector
support that preceded SSE and extends to the 256-bit and 512-bit vectors of AVX,
AVX2, and AVX512 that followed SSE.</para>
<para>Within the GCC Intel intrinsic implementation these are all
implemented as vector attribute extensions of the appropriate size
(<literal>__vector_size__</literal> (8 | 16 | 32 | 64)). For the PowerPC target, GCC currently
only supports the native <literal>__vector_size__</literal> (16). These we can support directly
in VMX/VSX registers and associated instructions. GCC will compile code with
other <literal>__vector_size__</literal> values, but the resulting types are treated as simple
arrays of the element type. This does not allow the compiler to use the vector
registers and vector instructions for these (non-native) vectors.</para>
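<para>As a small sketch of the source-level behavior (the type and function names
are hypothetical), GCC accepts the same element-wise code for several
<literal>__vector_size__</literal> values. Which registers and instructions are
actually used depends on the target and compiler version, and is best checked in
the generated assembly:</para>
<!--
```c
// --><programlisting><![CDATA[// Sketch only; all names here are hypothetical.
typedef int v2si_like __attribute__ ((__vector_size__ (8)));   /* MMX sized */
typedef int v4si_like __attribute__ ((__vector_size__ (16)));  /* SSE sized */
typedef int v8si_like __attribute__ ((__vector_size__ (32)));  /* AVX sized */

_Static_assert (sizeof (v2si_like) == 8, "two 32-bit lanes");
_Static_assert (sizeof (v4si_like) == 16, "four 32-bit lanes");
_Static_assert (sizeof (v8si_like) == 32, "eight 32-bit lanes");

/* Element-wise add of a 32-byte vector. Operands are passed by pointer
   so the sketch does not depend on how a target passes wide vectors. */
void
add_v8si_sketch (const v8si_like *a, const v8si_like *b, v8si_like *out)
{
  *out = *a + *b;
}
// End of sketch. ]]></programlisting><!--
```
-->
<para>Compiling this with <literal>-S</literal> for the target of interest shows
whether the 32-byte add was mapped to vector instructions or lowered to
narrower operations.</para>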
<para>So the PowerISA VMX/VSX facilities and GCC compiler support for
128-bit/16-byte vectors and associated vector built-ins
are well matched to implementing equivalent x86 SSE intrinsic functions.
However, implementing the older MMX (64-bit) and the newer
AVX (256-bit and 512-bit) extensions requires more thought and some ingenuity.</para>
<xi:include href="sec_handling_mmx.xml"/>
<xi:include href="sec_handling_avx.xml"/>
</section>