diff --git a/Intrinsics_Reference/ch_biendian.xml b/Intrinsics_Reference/ch_biendian.xml index 9585d06..b79099a 100644 --- a/Intrinsics_Reference/ch_biendian.xml +++ b/Intrinsics_Reference/ch_biendian.xml @@ -80,9 +80,9 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.biendian"> __vector, __pixel, and __bool. These keywords are used to specify vector data types (). Because - these identifiers may conflict with keywords in more recent C - and C++ language standards, compilers may implement these in one - of two ways. + these identifiers may conflict with keywords in more recent + language standards for C and C++, compilers may implement these + in one of two ways. @@ -104,6 +104,16 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.biendian"> + + As a motivating example, the vector token is used as a type in the + C++ Standard Template Library, and hence cannot be used as an + unrestricted keyword, but can be used in the context-sensitive + implementation. For example, vector + char is distinct from std::vector in the context-sensitive + implementation. + Vector literals may be specified using a type cast and a set of literal initializers in parentheses or braces. For example, @@ -129,16 +139,15 @@ vector double g = (vector double) { 3.5, -24.6 }; For the C and C++ programming languages (and related/derived - languages), these data types may be accessed based on the type - names listed in when - Power SIMD language extensions are enabled using either the - vector or __vector keywords. Note - that the ELFv2 ABI for Power also includes a vector - _Float16 data type. However, no Power compilers have yet - implemented such a type, and it is not clear that this will - change anytime soon. Thus this document has removed the - vector _Float16 data type, and all intrinsics that - reference it. + languages), the "Power SIMD C Types" listed in the leftmost + column of may be used + when Power SIMD language extensions are enabled. 
Either + vector or __vector may be used in the + type name. Note that the ELFv2 ABI for Power also includes a + vector _Float16 data type. As of this writing, no + current compilers for Power have implemented such a type. This + document does not include that type or any intrinsics related to + it. For the Fortran language, Pointers to vector types are defined like pointers of other C/C++ types. Pointers to vector objects may be defined to have const and volatile properties. Pointers to vector objects must - be divisible by 16, as vector objects are always aligned on - quadword (128-bit) boundaries. + be addresses divisible by 16, as vector objects are always + aligned on quadword (16-byte, or 128-bit) boundaries. The preferred way to access vectors at an application-defined @@ -172,7 +181,8 @@ vector double g = (vector double) { 3.5, -24.6 }; not be used to access data that is not aligned at least to a quadword boundary. Built-in functions such as vec_xl and vec_xst are - provided for unaligned data access. + provided for unaligned data access. Please refer to for an example. One vector type may be cast to another vector type without @@ -182,7 +192,8 @@ vector double g = (vector double) { 3.5, -24.6 }; Compilers are expected to recognize and optimize multiple operations that can be optimized into a single hardware - instruction. For example, a load and splat hardware instruction + instruction. For example, a load-and-splat hardware instruction + (such as lxvdsx) might be generated for the following sequence: double *double_ptr; @@ -484,35 +495,55 @@ register vector double vd = vec_splats(*double_ptr); The traditional C/C++ operators are defined on vector types - with “do all” semantics for unary and binary +, + for unary and binary +, unary and binary –, binary *, binary %, and binary / as well as the unary and binary shift, logical and comparison operators, and the - ternary ?: operator. + ternary ?: operator. 
These operators perform their + operations "elementwise" on the base elements of the operands, + as follows. For unary operators, the specified operation is performed on - the corresponding base element of the single operand to derive - the result value for each vector element of the vector + each base element of the single operand to derive the result + value placed into the corresponding element of the vector result. The result type of unary operations is the type of the - single input operand. + single operand. For example, + + vector signed int a, b; +a = -b; + + produces the same result as + vector signed int a, b; +a = vec_neg (b); For binary operators, the specified operation is performed on - the corresponding base elements of both operands to derive the - result value for each vector element of the vector - result. Both operands of the binary operators must have the - same vector type with the same base element type. The result - of binary operators is the same type as the type of the input - operands. - + corresponding base elements of both operands to derive the + result value for each vector element of the vector result. Both + operands of the binary operators must have the same vector type + with the same base element type. The result of binary operators + is the same type as the type of the operands. For example, + + vector signed int a, b; +a = a + b; + + produces the same result as + + vector signed int a, b; +a = vec_add (a, b); Further, the array reference operator may be applied to vector data types, yielding an l-value corresponding to the specified element in accordance with the vector element numbering rules (see ). An l-value may either - be assigned a new value or accessed for reading its value. + be assigned a new value or accessed for reading its value. For + example, + vector signed int a; +signed int b, c; +b = a[0]; +a[3] = c;
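The elementwise operator semantics added in this hunk can be tried on any GCC or Clang target. The following is a minimal sketch that assumes the compilers' generic `vector_size` extension rather than the Power `vector` keyword; the `v4si` typedef and the helper names are illustrative and do not appear in the reference.

```c
/* Hypothetical stand-in for "vector signed int": the GCC/Clang generic
   vector extension, holding four 32-bit signed elements in 16 bytes. */
typedef int v4si __attribute__((vector_size(16)));

/* Unary minus is applied to each base element of the single operand. */
static v4si negate_elementwise(v4si b)
{
    return -b;
}

/* Binary plus is applied to corresponding base elements of both operands. */
static v4si add_elementwise(v4si a, v4si b)
{
    return a + b;
}

/* The array reference operator yields an l-value: element 0 is read into
   *first_out and element 3 of the local copy is overwritten with c. */
static v4si read_then_write(v4si a, int c, int *first_out)
{
    *first_out = a[0];
    a[3] = c;
    return a;
}
```

On a Power target with the SIMD language extensions enabled, the text above states that the first two helpers produce the same results as `vec_neg (b)` and `vec_add (a, b)`.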
@@ -584,6 +615,12 @@ register vector double vd = vec_splats(*double_ptr); + + This is no longer as useful as it once was. The primary use + case was for big-endian vector layout in little-endian + environments, which is now deprecated as discussed in . + Note that each element in a vector has the same representation @@ -632,7 +669,7 @@ register vector double vd = vec_splats(*double_ptr); compiler implementation for both BE and LE. These sample implementations are only intended as examples; designers of a compiler are free to use other methods to implement the - specified semantics as they see fit. + specified semantics.
Extended Data Movement Functions @@ -642,7 +679,7 @@ register vector double vd = vec_splats(*double_ptr); store instructions and provide access to the “auto-aligning” memory instructions of the VMX ISA where low-order address bits are discarded before performing a memory access. These - instructions access load and store data in accordance with the + instructions load and store data in accordance with the program's current endian mode, and do not need to be adapted by the compiler to reflect little-endian operation during code generation. @@ -744,31 +781,31 @@ register vector double vd = vec_splats(*double_ptr); - Previous versions of the VMX built-in functions defined - intrinsics to access the VMX instructions lvsl - and lvsr, which could be used in conjunction with + Before the bi-endian programming model was introduced, the + vec_lvsl and vec_lvsr intrinsics + were supported. These could be used in conjunction with vec_perm and VMX load and store instructions for unaligned access. The vec_lvsl and vec_lvsr interfaces are deprecated in accordance with the interfaces specified here. For compatibility, the built-in pseudo sequences published in previous VMX documents continue to work with little-endian data layout and the - little-endian vector layout described in this - document. However, the use of these sequences in new code is - discouraged and usually results in worse performance. It is - recommended (but not required) that compilers issue a warning - when these functions are used in little-endian - environments. + little-endian vector layout described in this document. + However, the use of these sequences in new code is discouraged + and usually results in worse performance. It is recommended + that compilers issue a warning when these functions are used + in little-endian environments. - It is recommended that programmers use the vec_xl - and vec_xst vector built-in functions to access - unaligned data streams. 
See the descriptions of these - instructions in for further - description and implementation details. + Instead, it is recommended that programmers use the + vec_xl and vec_xst vector built-in + functions to access unaligned data streams. See the + descriptions of these instructions in for further description and + implementation details.
-
+
Big-Endian Vector Layout in Little-Endian Environments (Deprecated) @@ -1047,7 +1084,7 @@ register vector double vd = vec_splats(*double_ptr);
Examples and Limitations -
+
Unaligned vector access A common programming error is to cast a pointer to a base type @@ -1070,8 +1107,8 @@ register vector double vd = vec_splats(*double_ptr); int a[4096]; vector int x = vec_xl (0, a);
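For readers without a Power toolchain at hand, the same point can be sketched in portable C. This is not `vec_xl`; it swaps in `memcpy`, which likewise makes no assumption that the source address is a multiple of 16. The `quad_i32` type and `load_unaligned` name are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-byte container standing in for a "vector int" object;
   the alignment mirrors the quadword alignment of vector objects. */
typedef struct {
    _Alignas(16) int32_t lane[4];
} quad_i32;

/* Copies four ints from an address that is only guaranteed to have int
   alignment. Casting src to a pointer to a 16-byte-aligned type and
   dereferencing it would be the programming error described above. */
static quad_i32 load_unaligned(const int32_t *src)
{
    quad_i32 q;
    memcpy(q.lane, src, sizeof q.lane);
    return q;
}
```

Calling `load_unaligned(&a[1])` on an `int` array is safe even though `&a[1]` is generally not quadword-aligned.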
-
- vec_sld is not bi-endian +
+ vec_sld and vec_sro are not bi-endian One oddity in the bi-endian vector programming model is that vec_sld has big-endian semantics for code @@ -1099,7 +1136,7 @@ register vector double vd = vec_splats(*double_ptr); vec_sro is not bi-endian for similar reasons.
-
+
Limitations on bi-endianness of vec_perm The vec_perm intrinsic is bi-endian, provided diff --git a/Intrinsics_Reference/ch_intro.xml b/Intrinsics_Reference/ch_intro.xml index 9e5d38d..9c959fa 100644 --- a/Intrinsics_Reference/ch_intro.xml +++ b/Intrinsics_Reference/ch_intro.xml @@ -72,8 +72,8 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_intro"> IBM extended VMX by introducing the Vector-Scalar Extension - (VSX) for the POWER7 family of processors. VSX adds 64 logical - Vector Scalar Registers (VSRs); however, to optimize the amount + (VSX) for the POWER7 family of processors. VSX adds sixty-four + 128-bit vector-scalar registers (VSRs); however, to optimize the amount of per-process register state, the registers overlap with the VRs and the scalar floating-point registers (FPRs) (see ). The VSRs can represent all @@ -88,7 +88,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_intro"> Both the VMX and VSX instruction sets have been expanded for the POWER8 and POWER9 processor families. Starting with POWER8, a VSR can now contain a single 128-bit integer; and starting - with POWER9, a VSR can contain a single 128-bit floating-point + with POWER9, a VSR can contain a single 128-bit IEEE floating-point value. Again, the ISA currently only supports 128-bit operations on values in the VRs. @@ -263,6 +263,26 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_intro"> + + + POWER8 Processor User's Manual for the Single-Chip + Module. + + https://ibm.ent.box.com/s/649rlau0zjcc0yrulqf4cgx5wk3pgbfk + + + + + + + POWER9 Processor User's Manual. + + https://ibm.ent.box.com/s/tmklq90ze7aj8f4n32er1mu3sy9u8k3k + + + + Power Vector Library. @@ -272,6 +292,17 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_intro"> + + + POWER8 In-Core Cryptography: The Unofficial + Guide. + + https://github.com/noloader/POWER8-crypto/blob/master/power8-crypto.pdf + + + + Using the GNU Compiler Collection. 
diff --git a/Intrinsics_Reference/ch_techniques.xml b/Intrinsics_Reference/ch_techniques.xml index 8706b28..2ed5900 100644 --- a/Intrinsics_Reference/ch_techniques.xml +++ b/Intrinsics_Reference/ch_techniques.xml @@ -113,7 +113,9 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_techniques"> references. (restrict can be used only in C when compiling for the C99 standard or later. __restrict__ is a language extension, available - in both GCC and Clang, that can be used for both C and C++.) + in GCC, Clang, and the XL compilers, that can be used + without restriction for both C and C++. See your compiler's + user manual for details.) Suppose you have a function that takes two pointer @@ -159,8 +161,8 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_techniques"> ). In particular, the Power Vector Library (see ) provides additional - portability across compiler versions, as well as interfaces that - hide cases where assembly language is needed. + portability across compiler and ISA versions, as well as + interfaces that hide cases where assembly language is needed.
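As a reminder of what the `restrict` note in the first hunk of this file refers to, here is a minimal C99 sketch. The function name and parameters are illustrative only and are not taken from the reference.

```c
#include <stddef.h>

/* restrict (C99 or later) promises the compiler that, within this
   function, dst and src never refer to overlapping memory, so a store
   through dst cannot change a value later read through src. */
static void scale_into(float *restrict dst, const float *restrict src,
                       float factor, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * factor;
}
```

In C++, where `restrict` is not a keyword, the text above points to `__restrict__` as the equivalent language extension.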
@@ -202,7 +204,10 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_techniques"> responsible for following the calling conventions established by the ABI (see ). Again, it is best to look at examples. One place to find well-written - .S files is in the GLIBC project. + .S files is in the GLIBC project. You can also + study the assembly output from your favorite compiler, which can + be obtained with the -S or similar option, or by + using the objdump utility.
@@ -214,13 +219,15 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_techniques">
x86 Vector Portability Headers - Recent versions of the GCC and Clang open source compilers - provide "drop-in" portability headers for portions of the - Intel Architecture Instruction Set Extensions (see ). These headers mirror the APIs - of Intel headers having the same names. Support is provided - for the MMX and SSE layers, up through SSE4. At this time, no - support for the AVX layers is envisioned. + of Intel headers having the same names. As of this writing, + support is provided for the MMX and SSE layers, up through + SSE3 and portions of SSE4. No support for the AVX layers is + envisioned. The portability headers are available starting + with GCC 8.1 and Clang 9.0.0. The portability headers provide the same semantics as the diff --git a/Intrinsics_Reference/ch_vec_reference.xml b/Intrinsics_Reference/ch_vec_reference.xml index 970b1b6..c02b5da 100644 --- a/Intrinsics_Reference/ch_vec_reference.xml +++ b/Intrinsics_Reference/ch_vec_reference.xml @@ -104,8 +104,9 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> separate example implementations are shown for each endianness. - The implementations show which vector instructions are used in - the implementation of a particular intrinsic. When trying to + The implementations show a sequence of instructions that may be + used in the implementation of a particular intrinsic, and + usually include vector instructions. When trying to determine which intrinsic to use, it can be useful to have a cross-reference from a specific vector instruction to the intrinsics whose implementations make use of it. This manual @@ -129,19 +130,6 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> instructions, which may be preferred for portability. - - - Phased in. Not all - compilers have yet completed implementing this form of the - intrinsic. - - - - - Deferred. No compiler yet - supports this form of the intrinsic. - - Deprecated. 
This form of @@ -150,6 +138,30 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> +
+ Terminology + + Some intrinsic descriptions indicate that either + modular arithmetic or + saturating arithmetic is used. This + refers to what happens when an operation overflows the number + of available bits. A modular operation that overflows + truncates the result on the left, also known as wrapping the + result. A saturating operation that overflows produces the + largest (or smallest) possible result representable in the + output element type. + + + Operands are sometimes represented as having a const + int type. In such cases, the programmer is expected to + provide an integer literal. When the literal has specific + required bounds, this is often represented instead by such + phrases as "5-bit signed literal" or "2-bit unsigned literal" + to specify them. In such cases, compilers are encouraged to + at least warn upon detecting an out-of-range value. Providing + a variable when a literal is required is a compile-time error. + +
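The difference between modular and saturating arithmetic in the new Terminology section can be illustrated on a single 8-bit element. The following scalar sketch is an illustration only; the function names are hypothetical, and the vector intrinsics apply the same rule to every element.

```c
#include <stdint.h>

/* Modular: the sum is computed in int, then truncated on the left to
   8 bits, so an overflowing result wraps. */
static uint8_t add_modular_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}

/* Saturating, unsigned: an overflowing sum becomes the largest value
   representable in the element type. */
static uint8_t add_saturating_u8(uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned)a + b;
    return sum > UINT8_MAX ? UINT8_MAX : (uint8_t)sum;
}

/* Saturating, signed: overflow clamps to the largest value, underflow
   clamps to the smallest value. */
static int8_t add_saturating_s8(int8_t a, int8_t b)
{
    int sum = a + b;
    if (sum > INT8_MAX)
        return INT8_MAX;
    if (sum < INT8_MIN)
        return INT8_MIN;
    return (int8_t)sum;
}
```

For example, 200 + 100 yields 44 under modular arithmetic (300 mod 256) and 255 under unsigned saturating arithmetic.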
@@ -165,7 +177,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: Returns a vector r that contains the - absolute values of the contents of the given vector + absolute values of the contents of the vector a. Result value: @@ -348,7 +360,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> The value of each element of r is the absolute difference of the corresponding elements of a and b, using - modulo arithmetic. + modular arithmetic. Endian considerations: None. @@ -478,7 +490,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: Returns a vector r that contains the - saturated absolute values of the contents of the given vector + saturated absolute values of the contents of the vector a. Result value: @@ -894,7 +906,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: - Returns a vector of carry bits produced by adding two vectors. + Returns a vector of carries produced by adding two vectors. Result value: The value of each element of r is the @@ -1038,6 +1050,11 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Endian considerations: None. + Notes: + Code generated for this intrinsic should ensure only the + low-order bit of c participates + in the sum. + vspltisw @@ -1191,7 +1208,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: - Returns a vector of carry bits produced by adding two vectors and + Returns a vector of carries produced by adding two vectors and a carry vector. Result value: @@ -1204,6 +1221,11 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Endian considerations: None. + Notes: + Code generated for this intrinsic should ensure only the + low-order bit of c participates + in the sum. + vspltisw @@ -1549,8 +1571,9 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: - Tests whether all pairs of corresponding elements of the given vectors - are equal. 
+ Tests whether all elements of a + are equal to the corresponding elements of b. Result value: r is 1 if each element of a is equal to the @@ -1912,10 +1935,11 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> greater than or equal to the corresponding elements of b. - Result value: r is 1 if each - element of a is greater than or equal - to the corresponding element of b. - Otherwise, r is 0. + Result value: r is 1 if + all elements of a are greater + than or equal to the corresponding elements of b. Otherwise, r is 0. Endian considerations: None. @@ -2198,10 +2222,11 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> greater than the corresponding elements of b. - Result value: r is 1 if each - element of a is greater than the - corresponding element of b. Otherwise, - r is 0. + Result value: r is 1 if + all elements of a are greater + than the corresponding elements of b. Otherwise, r is 0. Endian considerations: None. @@ -2480,7 +2505,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: - Tests whether each element of a given vector is within a given range. + Tests whether all elements of a vector are within a given range. Result value: r is 1 if each element of a has a value less than or @@ -3141,12 +3166,12 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: - Tests whether each element of a is a - not-a-number (NaN). + Tests whether all elements of a + are not-a-number (NaN). - Result value: r is 1 if each - element of a is a NaN. Otherwise, - r is 0. + Result value: r is 1 if + all elements of a are + NaN. Otherwise, r is 0. Endian considerations: None. @@ -3237,13 +3262,15 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: - Tests whether all sets of corresponding elements of the given vectors - are not equal. + Tests whether all elements of a + are not equal to the corresponding elements of b. 
- Result value: r is 1 if each - element of a is not equal to the - corresponding element of b. Otherwise, - r is 0. + Result value: r is 1 if + all elements of a are not equal + to the corresponding elements of b. Otherwise, r is 0. Endian considerations: None. @@ -3596,15 +3623,15 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: - Tests whether each element of a is not - greater than or equal to the corresponding element of b. + Tests whether all elements of a + are not greater than or equal to the corresponding elements of + b. - Result value: r is 1 if each - element of a is not greater than or - equal to the corresponding element of b. Otherwise, r - is 0. + Result value: r is 1 if + all elements of a are not + greater than or equal to the corresponding elements of b. Otherwise, r is 0. Endian considerations: None. @@ -3707,14 +3734,15 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: - Tests whether each element of a is not - greater than the corresponding element of a + are not greater than the corresponding elements of b. - Result value: r is 1 if each - element of a is not greater than the - corresponding element of b. Otherwise, - r is 0. + Result value: r is 1 if + all elements of a are not + greater than the corresponding elements of b. Otherwise, r is 0. Endian considerations: None. @@ -3817,14 +3845,15 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: - Tests whether each element of a is not - less than or equal to the corresponding element of b. + Tests whether all elements of a + are not less than or equal to the corresponding elements of + b. - Result value: r is 1 if each - element of a is not less than or equal - to the corresponding element of b. - Otherwise, r is 0. + Result value: r is 1 if + all elements of a are not less + than or equal to the corresponding elements of b. Otherwise, r is 0. Endian considerations: None. 
@@ -3927,14 +3956,15 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: - Tests whether each element of a is not - less than the corresponding element of a + are not less than the corresponding elements of b. - Result value: r is 1 if each - element of a is not less than the - corresponding element of b. Otherwise, - r is 0. + Result value: r is 1 if + all elements of a are not less + than the corresponding elements of b. Otherwise, r is 0. Endian considerations: None. @@ -4037,11 +4067,12 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: - Tests whether each element of the given vector is numeric (not a NaN). + Tests whether all elements of the vector are numeric (not NaN). - Result value: r is 1 if each - element of a is numeric (not a NaN). - Otherwise, r is 0. + Result value: r is 1 if + all elements of a are numeric + (not NaN). Otherwise, r is + 0. Endian considerations: None. @@ -4362,7 +4393,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> - Phased in + @@ -4381,7 +4412,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> - Phased in + @@ -4400,7 +4431,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> - Phased in + @@ -4685,7 +4716,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> - Phased in + @@ -4704,7 +4735,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> - Phased in + @@ -4723,7 +4754,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> - Phased in + @@ -4778,8 +4809,9 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: - Tests whether any pair of corresponding elements of the given vectors is - equal. + Tests whether any element of a + is equal to the corresponding element of b. 
Result value: r is 1 if any element of a is equal to the @@ -6430,7 +6462,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: - Tests whether any element of the given vector is a NaN. + Tests whether any element of the source vector is a NaN. Result value: r is 1 if any element of a is a NaN. Otherwise, @@ -6537,8 +6569,9 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: - Tests whether any pair of corresponding elements of the given vectors is - not equal. + Tests whether any element of a + is not equal to the corresponding element of b. Result value: r is 1 if any element of a is not equal to the @@ -7423,7 +7456,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: - Tests whether any element of the given vector is numeric (not a NaN). + Tests whether any element of the source vector is numeric (not a NaN). Result value: r is 1 if any element of a is numeric (not a NaN). @@ -7530,7 +7563,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: - Tests whether the value of any element of a given vector is outside of a + Tests whether the value of any element of a vector is outside of a given range. Result value: r is 1 if the @@ -7997,7 +8030,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> - Phased in + @@ -8017,7 +8050,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> ISA 3.0 or later - Phased in + @@ -8144,9 +8177,6 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> programming model. - Notes: This intrinsic may - not yet be available in all implementations. - vcipher vec_cipher_be @@ -8232,9 +8262,6 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> programming model. - Notes: This intrinsic may - not yet be available in all implementations. 
- vcipherlast vec_cipherlast_be @@ -8302,7 +8329,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: Performs a bounds comparison of each set of corresponding elements - of the given vectors. + of two vectors. Result value: Each element of r has the value 0 @@ -8411,7 +8438,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: Returns a vector containing the results of comparing each set of - corresponding elements of the given vectors for equality. + corresponding elements of two vectors for equality. Result value: For each element of r, the value @@ -8720,7 +8747,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: Returns a vector containing the results of a greater-than-or-equal-to - comparison between each set of corresponding elements of the given + comparison between each set of corresponding elements of two vectors. Result value: @@ -8995,7 +9022,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: Returns a vector containing the results of a greater-than - comparison between each set of corresponding elements of the given + comparison between each set of corresponding elements of two vectors. Result value: @@ -9258,7 +9285,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: Returns a vector containing the results of a less-than-or-equal - comparison between each set of corresponding elements of the given + comparison between each set of corresponding elements of two vectors. Result value: @@ -9533,7 +9560,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: Returns a vector containing the results of a less-than - comparison between each set of corresponding elements of the given + comparison between each set of corresponding elements of two vectors. 
Result value: @@ -9796,7 +9823,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: Returns a vector containing the results of comparing each set of - corresponding elements of the given vectors for inequality. + corresponding elements of two vectors for inequality. Result value: For each element of r, the value @@ -10114,7 +10141,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: Returns a vector containing the results of comparing each set of - corresponding elements of the given vectors for inequality, or for + corresponding elements of two vectors for inequality, or for an element with a zero value. Result value: @@ -10310,7 +10337,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: Returns a vector containing the number of most-significant bits - equal to zero of each corresponding element of the given vector. + equal to zero of each corresponding element of the source vector. Result value: The value of each element of r is @@ -10383,7 +10410,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> - Phased in + @@ -10399,7 +10426,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> - Phased in + @@ -10415,7 +10442,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> - Phased in + @@ -10431,7 +10458,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> - Phased in + @@ -10447,7 +10474,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> - Phased in + @@ -10463,7 +10490,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> - Phased in + @@ -10479,7 +10506,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> - Phased in + @@ -10495,7 +10522,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> - Phased in + @@ -10613,7 +10640,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: Returns a vector containing the number of 
least-significant bits - equal to zero of each corresponding element of the given vector. + equal to zero of each corresponding element of the source vector. Result value: The value of each element of r is @@ -10992,7 +11019,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> - Phased in + @@ -11011,7 +11038,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> - Phased in + @@ -11035,17 +11062,18 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> The value of each element of r is the closest floating-point approximation of the value of the corresponding element of a divided - by 2 to the power of b, which should + by 2 to the power of b, which must be in the range 0–31. Endian considerations: None. Notes: - The example implementations below assume b is zero, so that the scaling code is - omitted. Scaling is accomplished by loading a constant and - multiplying it by the result. + The example implementations below assume b is zero, so that the scaling code + is omitted. Scaling is accomplished by multiplying each + element of r by 2 to the + power of –b. @@ -11097,7 +11125,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> vector signed int - const int + 5-bit unsigned literal @@ -11113,7 +11141,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> vector unsigned int - const int + 5-bit unsigned literal @@ -11129,7 +11157,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> vector signed long long - const int + 5-bit unsigned literal @@ -11145,7 +11173,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> vector unsigned long long - const int + 5-bit unsigned literal @@ -11175,7 +11203,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> saturated signed-integer value, truncated towards zero, obtained by multiplying the corresponding element of a multiplied by 2 to the power of b, which should be in the range 0–31. 
+ role="bold">b, which must be in the range 0–31. Endian considerations: None. @@ -11226,7 +11254,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> vector float - const int + 5-bit unsigned literal @@ -11256,7 +11284,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> saturated unsigned-integer value, truncated towards zero, obtained by multiplying the corresponding element of a multiplied by 2 to the power of b, which should be in the range 0–31. + role="bold">b, which must be in the range 0–31. Endian considerations: None. @@ -11307,7 +11335,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> vector float - const int + 5-bit unsigned literal @@ -11332,7 +11360,6 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: Divides the elements in one vector by the corresponding elements in another vector and places the quotients in the result vector. - Division is emulated using scalar arithmetic for integer types. 
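The scaling rule in the vec_ctf and vec_cts hunks above (divide or multiply by 2 to the power of b, with b in the range 0–31) can be modelled for one element in scalar C. This is a sketch under stated assumptions, not the compilers' implementation: the function names are hypothetical, and NaN inputs are deliberately excluded rather than modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one vec_ctf element: convert to float, then divide by
   2 to the power of b. Dividing by a power of two is exact here. */
static float ctf_element(int32_t a, int b)
{
    assert(b >= 0 && b <= 31);
    return (float)a / (float)(1u << b);
}

/* Scalar model of one vec_cts element: multiply by 2 to the power of b,
   saturate to the signed 32-bit range, and truncate toward zero. */
static int32_t cts_element(float a, int b)
{
    assert(b >= 0 && b <= 31);
    assert(a == a);                      /* NaN is outside this sketch */
    float scaled = a * (float)(1u << b);
    if (scaled >= 2147483648.0f)
        return INT32_MAX;
    if (scaled <= -2147483648.0f)
        return INT32_MIN;
    return (int32_t)scaled;              /* C conversion truncates toward zero */
}
```

With b equal to zero, both helpers reduce to a plain conversion, which matches the note that the example implementations omit the scaling code in that case.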
Result value: The value of each element of r is @@ -11680,7 +11707,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> - Phased in + @@ -11703,7 +11730,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> - Phased in + @@ -11726,7 +11753,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> - Phased in + @@ -11834,7 +11861,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> - Phased in + @@ -11859,7 +11886,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> - Phased in + @@ -11884,7 +11911,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> - Phased in + @@ -11992,7 +12019,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> - Phased in + @@ -12017,7 +12044,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> - Phased in + @@ -12042,7 +12069,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> - Phased in + @@ -12148,7 +12175,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> - Phased in + @@ -12171,7 +12198,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> - Phased in + @@ -12194,7 +12221,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> - Phased in + @@ -13073,7 +13100,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: - Extracts an exponent from a floating-point number. + Extracts exponents from a vector of floating-point numbers. Result value: Each element of r is extracted @@ -13516,7 +13543,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> vector unsigned char - const int + const int (range [0,12]) @@ -13552,7 +13579,10 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> role="bold">b, and returns the first position of equality. - Result value: Returns the element index of the position of the first character match. 
If no match, returns the number of characters as an element count in the vector argument. + Result value: Returns the + element index of the position of the first character match in + natural element order. If no match, returns the number of + characters as an element count in the vector argument. Endian considerations: None. @@ -13823,10 +13853,10 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> equality, or the zero string terminator. Result value: Returns the - element index of the position of either the first character - match or an end-of-string (EOS) terminator. If no match or - terminator, returns the number of characters as an element count - in the vector argument. + element index of the position, in natural element order, of + either the first character match or an end-of-string (EOS) + terminator. If no match or terminator, returns the number of + characters as an element count in the vector argument. Endian considerations: None. @@ -14165,9 +14195,9 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> inequality. Result value: Returns the - element index of the position of the first character - mismatch. If no mismatch, returns the number of characters as an - element count in the vector argument. + element index of the position of the first character mismatch in + natural element order. If no mismatch, returns the number of + characters as an element count in the vector argument. Endian considerations: None. @@ -14422,10 +14452,10 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> inequality, or the zero string terminator. Result value: Returns the - element index of the position of either the first character - mismatch or an end-of-string (EOS) terminator. If no mismatch or - terminator, returns the number of characters as an element count - in the vector argument. 
+ element index of the position, in natural element order, of + either the first character mismatch or an end-of-string (EOS) + terminator. If no mismatch or terminator, returns the number of + characters as an element count in the vector argument. Endian considerations: None. @@ -14844,7 +14874,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: - Converts two input vectors of long long integers or double-precision + Converts two vectors of long long integers or double-precision floating-point numbers to a vector of single-precision numbers. Result value: Elements of @@ -15014,7 +15044,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: - Converts the elements of an input vector to single-precision + Converts the elements of a source vector to single-precision floating-point and stores the results in the even elements of the target vector. @@ -15148,7 +15178,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: - Converts the elements of an input vector to single-precision + Converts the elements of a source vector to single-precision floating-point and stores the results in the odd elements of the target vector. @@ -15284,7 +15314,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: Returns a vector containing the largest representable floating-point integral values less than or equal to the values of the corresponding - elements of the given vector. + elements of the source vector. 
Result value: Each element of r contains the largest representable @@ -16062,10 +16092,10 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> vector signed int - vector unsigned char + vector unsigned char - const int + const int (range [0,12]) @@ -16091,10 +16121,10 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> vector unsigned int - vector unsigned char + vector unsigned char - const int + const int (range [0,12]) @@ -16674,6 +16704,12 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Endian considerations: None. + + Notes: + Be careful to note that the address (b+c) is aligned to an + element boundary. Do not attempt to load unaligned data + with this intrinsic. + lvebx @@ -16859,6 +16895,12 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Endian considerations: None. + Notes: + This intrinsic can be used to indicate the last access to a + portion of memory, as a hint to the data cache controller that + the associated cache line can be replaced without performance + loss. + lvxl @@ -19224,8 +19266,8 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: - Copies the contents of the Vector Status and Control Register into the - result vector. + Copies the contents of the Vector Status and Control Register + (VSCR) into the result vector. Result value: The high-order 16 bits of the VSCR are copied into the seventh element of The contents of the VSCR are placed in the low-order 32 bits of the result vector, regardless of endianness. + Notes: + The use of vector unsigned short as the result + type eases access to the two bits currently defined in the + VSCR. Following execution of vec_mfvscr, + r[6] will contain + VSCRNJ in the low-order bit, and + r[7] will contain + VSCRSAT in the low-order bit. 
+ mfvscr @@ -19750,19 +19801,69 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Returns a vector containing the results of performing a multiply-sum operation using the source vectors. - Result value: Assume that the - elements of each vector are numbered beginning with 0. If - a is a vector signed char or a vector - unsigned char vector, then let m be 4. Otherwise, - let m be 2. The value of each element - n of r is obtained - as follows. For p = mn to - mn + m – 1, multiply - element p of a - by element p of b. - Add the sum of these products to element n of - c. All additions are performed using - 32-bit modular arithmetic. + Result value: + There are two cases: + + + + When a is of type + vector signed char or vector unsigned char, each word + element of r is + computed as follows: + + + + Each of the four byte elements contained in the + corresponding word element of a is multiplied by the + corresponding byte element of b. + + + + + The sum of these four halfword products is added + to the corresponding word element in c and placed in the + corresponding word element of r. + + + + + + + + When a is of type + vector signed short or vector unsigned short, each word + element of r is + computed as follows: + + + + Each of the two halfword elements contained in the + corresponding word element of a is multiplied by the + corresponding halfword element of b. + + + + + The sum of these two word products is added + to the corresponding word element in c and placed in the + corresponding word element of r. + + + + + + + All operations are performed using 32-bit modular arithmetic. + + Endian considerations: None. @@ -20022,13 +20123,13 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> vec_mtvscr Vector Move to Vector Status and Control Register - r = vec_mtvscr (a) + vec_mtvscr (a) Purpose: - Copies the given value into the Vector Status and Control Register. - The low-order 32 bits of a are copied - into the VSCR. 
+ Copies a value into the Vector Status and Control Register + (VSCR). The low-order 32 bits of a are copied into the VSCR. Result value: None. @@ -20044,17 +20145,11 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Supported type signatures for vec_mtvscr - + - - - - r - - a @@ -20068,9 +20163,6 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> - - void - vector bool char @@ -20081,9 +20173,6 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> - - void - vector signed char @@ -20094,9 +20183,6 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> - - void - vector unsigned char @@ -20107,9 +20193,6 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> - - void - vector bool int @@ -20120,9 +20203,6 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> - - void - vector signed int @@ -20133,9 +20213,6 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> - - void - vector unsigned int @@ -20146,9 +20223,6 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> - - void - vector pixel @@ -20159,9 +20233,6 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> - - void - vector bool short @@ -20172,9 +20243,6 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> - - void - vector signed short @@ -20185,9 +20253,6 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> - - void - vector unsigned short @@ -20212,8 +20277,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: - Returns a vector containing the results of performing a multiply - operation using the source vectors. + Compute the products of corresponding elements of two vectors. 
Result value: Each element of r receives the product of @@ -20970,7 +21034,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Result value: The value of each element of r is the negated absolute - value of the fcorresponding element of a. For integer vectors, the arithmetic is modular. Endian considerations: @@ -21147,7 +21211,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: - Performs a bitwise NAND of the given vectors. + Performs a bitwise NAND of two vectors. Result value: r is the bitwise NAND of a and programming model. - Notes: This intrinsic may - not yet be available in all implementations. - vncipher vec_ncipher_be @@ -21536,9 +21597,6 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> programming model. - Notes: This intrinsic may - not yet be available in all implementations. - vncipherlast vec_ncipherlast_be @@ -21608,12 +21666,14 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Returns a vector containing the floating-point integral values nearest to the values of the corresponding elements of the source vector. - Result value: Each element of - r contains the nearest representable - floating-point integral value to the value of the corresponding element - of a. When an input element value is - exactly between two integer values, the input value with the larger - absolute value is selected. + Result value: Each + element of r contains the + nearest representable floating-point integral value to the value + of the corresponding element of a. When an input element value is exactly + between two integer values, the input value with the larger + absolute value is selected. The current floating-point rounding + mode is ignored. Endian considerations: None. @@ -22075,7 +22135,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: - Performs a bitwise NOR of the given vectors. + Performs a bitwise NOR of two vectors. 
Result value: r is the bitwise NOR of a and Purpose: - Performs a bitwise OR of the given vectors. + Performs a bitwise OR of two vectors. Result value: r is the bitwise OR of a and Purpose: - Returns a vector that contains elements selected from two input - vectors, in the order specified by a third input vector. + Returns a vector that contains elements selected from two + vectors, in the order specified by a third vector. Result value: Let v be the concatenation of - The vec_perm built-in should only use permutations - that reorder vector elements of the specified type, not to reorder - bytes within those elements. The results are not guaranteed to be - consistent across big- and little-endian if you violate this rule. + The vec_perm built-in should only use + permutations that reorder vector elements of the specified + type, not to reorder bytes within those elements. The + results are not guaranteed to be consistent across big- + and little-endian if you violate this rule. See . @@ -24492,8 +24554,8 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: - Applies a permute and exclusive-OR operation on two input vectors of byte - elements, with the selected elements identified by a third input vector. + Applies a permute and exclusive-OR operation on two vectors of byte + elements, with the selected elements identified by a third vector. Result value: For each i (0 ≤ i < 16), let @@ -24801,7 +24863,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: Returns a vector containing the number of bits set in each element of - the input vector. + the source vector. Result value: The value of each element of r is the number of bits set @@ -24972,7 +25034,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: Returns a vector containing estimates of the reciprocals of the - corresponding elements of the input vector. + corresponding elements of the source vector. 
Result value: Each element of r contains the estimated value of the @@ -24981,6 +25043,10 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Endian considerations: None. + Notes: + For finite reciprocals, this intrinsic guarantees at least 14 + bits of accuracy. + xvredp @@ -25056,17 +25122,10 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: - Returns a vector containing refined approximations of the division of - the corresponding elements of a by the - corresponding elements of b. This - built-in function provides an implementation-dependent precision, which - is commonly within 2 ulps (units in the last place) for most of the - numeric range expressible by - the input operands. This built-in function does not correspond to a - single IEEE operation and does not provide the overflow, underflow, and - NaN propagation characteristics specified for IEEE division. (Precision - may be a function of both the specified target processor model during - compilation and the actual processor on which a program is executed.) + Returns a vector containing refined approximations of the + division of the corresponding elements of a by the corresponding elements of + b. Result value: Each element of r contains a refined approximation of @@ -25076,10 +25135,34 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Endian considerations: None. - Notes: The example implementation - for vector double assumes that a register z - initially contains the double-precision floating-point value 1.0 - in each doubleword. + Notes: + + + + The example implementation for vector double assumes + that a register z initially + contains the double-precision floating-point value 1.0 + in each doubleword. + + + + + For finite reciprocals, this intrinsic guarantees at + least 23 bits of accuracy for single-precision floating + point, and at least 52 bits of accuracy for + double-precision floating point. 
+ + + + + This built-in function does not correspond to a single + IEEE operation and does not provide the overflow, + underflow, and NaN propagation characteristics specified + for IEEE division. + + + + xvredp @@ -25542,7 +25625,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Reverse the elements of a vector. Result value: Returns a vector - with the elements of the input vector in reversed order. + with the elements of the source vector in reversed order. Endian considerations: The vpermr instruction is most naturally used to implement this built-in function for a little-endian target, and the vperm instruction for a @@ -25787,7 +25870,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: Returns a vector containing the floating-point integral values nearest - to the values of the corresponding elements of the given vector. + to the values of the corresponding elements of the source vector. Result value: Each element of r contains the nearest representable @@ -26355,13 +26438,14 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: Returns a vector containing the rounded values of the corresponding - elements of the given vector. + elements of the source vector. Result value: Each element of r contains the value of the corresponding element of a, rounded to the nearest representable floating-point integer, using IEEE - round-to-nearest rounding. + round-to-nearest rounding. The current floating-point rounding + mode is ignored. Notes: This function might not follow the strict operation definition of the resolution of a tie during a round if the -qstrict=nooperationprecision compiler option is @@ -26445,7 +26529,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: Returns a vector containing a refined approximation of the reciprocal - square roots of the corresponding elements of the given vector. 
This + square roots of the corresponding elements of the source vector. This function provides an implementation-dependent greater precision than vec_rsqrte. Result value: Each element of @@ -26455,10 +26539,26 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Endian considerations: None. - Notes: The example implementations - assume that a register h initially - contains the floating-point value 0.5 in each element (single- or - double-precision as appropriate). + Notes: + + + + The example implementations assume that a register + h initially contains + the floating-point value 0.5 in each element (single- or + double-precision as appropriate). + + + + + For finite square roots, this intrinsic guarantees at + least 23 bits of accuracy for single-precision floating + point, and at least 52 bits of accuracy for + double-precision floating point. + + + + xvrsqrtedp @@ -26584,7 +26684,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: Returns a vector containing estimates of the reciprocal square roots of - the corresponding elements of the given vector. + the corresponding elements of the source vector. Result value: Each element of r contains the estimated value of the @@ -26593,6 +26693,10 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Endian considerations: None. + Notes: + For finite square roots, this intrinsic guarantees at least 14 + bits of accuracy. + xvrsqrtedp @@ -26686,9 +26790,6 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> programming model. - Notes: This intrinsic may - not yet be available in all implementations. - vsbox vec_sbox_be @@ -26746,9 +26847,9 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: - Returns a vector containing the value of either a or b - depending on the value of c. + Returns a vector selecting bits from two source vectors + depending on the corresponding bit values of a third source + vector. 
Result value: Each bit of r has the value of the corresponding @@ -27497,7 +27598,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> const int - const int + 4-bit unsigned literal @@ -27513,10 +27614,10 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> vector unsigned long long - const int + const int - const int + 4-bit unsigned literal @@ -27542,10 +27643,11 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Converts a vector of floating-point numbers to a vector of signed integers. - Result value: Each element of - r is obtained by truncating the - corresponding element of a to a signed - integer. + Result value: Each + element of r is obtained by + truncating the corresponding element of a to a signed integer. The current + floating-point rounding mode is ignored. Endian considerations: None. @@ -27735,7 +27837,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: - Converts elements of an input vector to signed integers and stores + Converts elements of a source vector to signed integers and stores them in the even-numbered elements of the result vector. Result value: Element 0 of @@ -27823,7 +27925,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: - Converts elements of an input vector to signed integers and stores them + Converts elements of a source vector to signed integers and stores them in the odd-numbered elements of the result vector. Result value: Element 1 of @@ -28134,7 +28236,8 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> role="bold">a and b is done in big-endian fashion (left to right), and the shift is always to the left. This will generally produce surprising results for - little-endian targets. + little-endian targets. See also . 
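Reviewer sketch, not part of the patch: the endian note above says the concatenation of a and b is formed in big-endian fashion (left to right) and the shift is always to the left. The scalar model below illustrates that reading, assuming the operation is a byte-wise shift of the 32-byte concatenation with a 4-bit count, as the "4-bit unsigned literal" entries in the following hunk suggest. The name `sld_model` and the convention that index 0 is the leftmost register byte are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not provided by any compiler header.
   Models shifting the concatenation a||b left by c bytes and keeping
   the leftmost 16 bytes. Index 0 is the leftmost (big-endian) byte. */
static void sld_model(const uint8_t a[16], const uint8_t b[16],
                      unsigned c, uint8_t r[16])
{
    uint8_t cat[32];

    assert(c <= 15u);          /* the count is a 4-bit unsigned literal */
    memcpy(cat, a, 16);        /* a occupies the left half  */
    memcpy(cat + 16, b, 16);   /* b occupies the right half */
    memcpy(r, cat + c, 16);    /* bytes c .. c+15 form the result */
}
```

Because the model is written in register byte order, it says nothing about how array elements map onto those bytes on a little-endian target, which is exactly where the surprising results mentioned in the note come from.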
@@ -28186,10 +28289,10 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> vector bool char - vector bool char + vector bool char - const int + 4-bit unsigned literal @@ -28205,10 +28308,10 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> vector signed char - vector signed char + vector signed char - const int + 4-bit unsigned literal @@ -28224,10 +28327,10 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> vector unsigned char - vector unsigned char + vector unsigned char - const int + 4-bit unsigned literal @@ -28243,10 +28346,10 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> vector bool int - vector bool int + vector bool int - const int + 4-bit unsigned literal @@ -28262,10 +28365,10 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> vector signed int - vector signed int + vector signed int - const int + 4-bit unsigned literal @@ -28281,10 +28384,10 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> vector unsigned int - vector unsigned int + vector unsigned int - const int + 4-bit unsigned literal @@ -28300,10 +28403,10 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> vector bool long long - vector bool long long + vector bool long long - const int + 4-bit unsigned literal @@ -28319,10 +28422,10 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> vector signed long long - vector signed long long + vector signed long long - const int + 4-bit unsigned literal @@ -28338,10 +28441,10 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> vector unsigned long long - vector unsigned long long + vector unsigned long long - const int + 4-bit unsigned literal @@ -28357,10 +28460,10 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> vector pixel - vector pixel + vector pixel - const int + 4-bit unsigned literal @@ -28376,10 +28479,10 @@ xmlns:xlink="http://www.w3.org/1999/xlink" 
xml:id="VIPR.vec-ref"> vector bool short - vector bool short + vector bool short - const int + 4-bit unsigned literal @@ -28395,10 +28498,10 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> vector signed short - vector signed short + vector signed short - const int + 4-bit unsigned literal @@ -28414,10 +28517,10 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> vector unsigned short - vector unsigned short + vector unsigned short - const int + 4-bit unsigned literal @@ -28433,10 +28536,10 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> vector double - vector double + vector double - const int + 4-bit unsigned literal @@ -28452,10 +28555,10 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> vector float - vector float + vector float - const int + 4-bit unsigned literal @@ -28478,7 +28581,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: - Returns a vector obtained by shifting left the concatenated input + Returns a vector obtained by shifting left the concatenated source vectors by the number of specified words. 
Result value: Vector vector signed char - vector signed char + vector signed char - const int + 2-bit unsigned literal @@ -28565,10 +28668,10 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> vector unsigned char - vector unsigned char + vector unsigned char - const int + 2-bit unsigned literal @@ -28584,10 +28687,10 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> vector signed int - vector signed int + vector signed int - const int + 2-bit unsigned literal @@ -28603,10 +28706,10 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> vector unsigned int - vector unsigned int + vector unsigned int - const int + 2-bit unsigned literal @@ -28622,10 +28725,10 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> vector signed long long - vector signed long long + vector signed long long - const int + 2-bit unsigned literal @@ -28641,10 +28744,10 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> vector unsigned long long - vector unsigned long long + vector unsigned long long - const int + 2-bit unsigned literal @@ -28660,10 +28763,10 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> vector signed short - vector signed short + vector signed short - const int + 2-bit unsigned literal @@ -28679,10 +28782,10 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> vector unsigned short - vector unsigned short + vector unsigned short - const int + 2-bit unsigned literal @@ -31159,7 +31262,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> vec_sro in big-endian code must be rewritten for little-endian targets. The shift count is in element 15 of b for big-endian, but in element 0 of b - for little-endian. + for little-endian. See also . @@ -32200,6 +32303,12 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Endian considerations: None. 
+ + Notes: + Be careful to note that the address (b+c) is aligned to an + element boundary. Do not attempt to store unaligned data with + this intrinsic. + stvebx @@ -32514,6 +32623,12 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Endian considerations: None. + Notes: + This intrinsic can be used to indicate the last access to a + portion of memory, as a hint to the data cache controller that + the associated cache line can be replaced without performance + loss. + stvxl @@ -33342,8 +33457,9 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: - Returns a vector containing the carry produced by subtracting each set - of corresponding elements of the given vectors. + Returns a vector wherein each element contains the carry + produced by subtracting the corresponding elements of the two + source vectors. Result value: The value of each element of r is the complement of the @@ -33489,6 +33605,11 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Endian considerations: None. + Notes: + Code generated for this intrinsic should ensure only the + low-order bit of c participates + in the sum. + vspltisw @@ -33655,6 +33776,11 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Endian considerations: None. + Notes: + Code generated for this intrinsic should ensure only the + low-order bit of c participates + in the sum. + vspltisw @@ -34102,14 +34228,34 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> operation within each word of the first source vector together with accumulated results in the second source vector. - Result value: If a is a vector of signed or unsigned char, then - let m be 4; otherwise, let m - be 2. For each element n of the result vector, the - value is obtained by adding elements mn through - mn + m – 1 of a and element n of b using saturated addition. + Result value: + There are two cases: + + + + a is a vector of signed + or unsigned char. 
For each element + n of the result vector, the value + is obtained by adding elements 4n + through 4n + 3 of a and element + n of b using saturated addition. + + + + + a is a vector of signed + short. For each element n of the + result vector, the value is obtained by adding elements + 2n and 2n + 1 + of a and element + n of b using saturated addition. + + + + Endian considerations: None. @@ -34397,7 +34543,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> vector float - const int + 7-bit unsigned literal @@ -34416,7 +34562,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> vector double - const int + 7-bit unsigned literal @@ -34443,7 +34589,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: Returns a vector containing the truncated values of the corresponding - elements of the given vector. + elements of the source vector. Result value: Each element of r contains the value of the @@ -35099,7 +35245,8 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Result value: Each element of r is obtained by truncating the corresponding element of a to an - unsigned integer. + unsigned integer. The current floating-point rounding mode is + ignored. Endian considerations: None. @@ -35186,7 +35333,8 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> role="bold">a and b. Each element of r is obtained by truncating the corresponding element of v to an - unsigned 32-bit integer. + unsigned 32-bit integer. The current floating-point rounding + mode is ignored. Endian considerations: The element numbering within a register is left-to-right for big-endian targets, and right-to-left for little-endian targets. 
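Reviewer sketch, not part of the patch: the revised text in this hunk defines v as the concatenation of a and b, with each element of r obtained by truncating the corresponding element of v to an unsigned 32-bit integer, ignoring the current rounding mode. A minimal scalar model of that wording follows, assuming a and b each hold two doubles in natural element order. The name `unsigned2_model` is hypothetical, and out-of-range inputs are deliberately not modeled because the hunk does not describe them.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model, not a compiler-provided function.
   v = a || b in natural element order; r[i] = trunc(v[i]).
   Inputs are assumed to lie in [0, 2^32): a C cast is undefined
   outside that range, so that case is rejected rather than guessed. */
static void unsigned2_model(const double a[2], const double b[2],
                            uint32_t r[4])
{
    for (int i = 0; i < 2; i++) {
        assert(a[i] >= 0.0 && a[i] < 4294967296.0);
        assert(b[i] >= 0.0 && b[i] < 4294967296.0);
        r[i]     = (uint32_t)a[i];   /* cast truncates toward zero */
        r[i + 2] = (uint32_t)b[i];
    }
}
```

A C floating-point to integer cast always discards the fractional part, so it matches the "rounding mode is ignored" wording for in-range values.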
@@ -35290,7 +35438,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: - Converts elements of an input vector to unsigned integers and stores + Converts elements of the source vector to unsigned integers and stores them in the even-numbered elements of the result vector. Result value: Element 0 of @@ -35380,7 +35528,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: - Converts elements of an input vector to unsigned integers and stores + Converts elements of the source vector to unsigned integers and stores them in the odd-numbered elements of the result vector. Result value: Element 1 of @@ -35502,7 +35650,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> GCC provides a commonly used synonym for vec_xl called vec_vsx_ld. Although these have the same behavior, only vec_xl is guaranteed to be portable across compliant - compilers. vec_xl should be preferred. + compilers. Therefore vec_xl is preferred. @@ -36428,6 +36576,8 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Endian considerations: None. + Notes: vec_xl_len_r should + not be used to load from cache-inhibited memory. sldi @@ -36518,7 +36668,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Purpose: - Performs a bitwise XOR of the given vectors. + Performs a bitwise XOR of two vectors. Result value: v is the bitwise exclusive OR of a and GCC provides a commonly used synonym for vec_xst called vec_vsx_st. Although these have the same behavior, only vec_xst is guaranteed to be portable across compliant - compilers. vec_xst should be preferred. + compilers. Therefore vec_xst is preferred. @@ -37439,6 +37589,8 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Endian considerations: None. + Notes: vec_xst_len should + not be used to store to cache-inhibited memory. 
sldi @@ -37753,6 +37905,8 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> Endian considerations: None. + Notes: vec_xst_len_r should + not be used to store to cache-inhibited memory. lvsr @@ -37834,4 +37988,5 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref"> +