To vec_not or not
Well, not exactly. The OpenPOWER ABI document lists vec_cmpne for all
numeric types, but the current GCC 6 documentation does not include
vec_cmpne in its list of built-ins. So it is planned in the ABI, but not
implemented yet.
Looking at the PowerISA 2.07B we find a VSX Vector Compare Equal to
Double-Precision but no Not Equal. In fact we see only vector double compare
instructions for greater than and greater than or equal in addition to the
equal compare. Not only can't we find a not equal, there are no less than or
less than or equal compares either.
So what is going on here? Partially this is the Reduced Instruction
Set Computer (RISC) design philosophy. In this case the compiler can generate
all the required compares using the existing vector instructions and simple
transforms based on Boolean algebra. So vec_cmpne(A,B) is simply
vec_not(vec_cmpeq(A,B)). And vec_cmplt(A,B) is simply vec_cmpgt(B,A), based
on the identity A < B iff B > A. Similarly vec_cmple(A,B) is implemented as
vec_cmpge(B,A).
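The operand-swap identities can be sketched in portable C. This is a minimal illustration, not the compiler's actual code: it uses the GCC/Clang generic vector extension instead of altivec.h so that it builds on any target, and the type and function names (v2f64, v2i64, cmpgt_f64 and so on) are made up for the example. On PPC64LE the two "native" helpers would correspond to vec_cmpgt and vec_cmpge.

```c
/* Generic vector types standing in for "vector double" and its compare
   result; these names are illustrative and do not come from altivec.h. */
typedef double    v2f64 __attribute__ ((vector_size (16)));
typedef long long v2i64 __attribute__ ((vector_size (16)));

/* The two compares assumed to exist natively. Each result lane is all
   ones (-1) when the relation holds and 0 when it does not. */
static inline v2i64 cmpgt_f64 (v2f64 a, v2f64 b) { return (v2i64) (a > b); }
static inline v2i64 cmpge_f64 (v2f64 a, v2f64 b) { return (v2i64) (a >= b); }

/* A < B iff B > A: swap the operands of greater-than. */
static inline v2i64 cmplt_f64 (v2f64 a, v2f64 b) { return cmpgt_f64 (b, a); }

/* A <= B iff B >= A: swap the operands of greater-than-or-equal. */
static inline v2i64 cmple_f64 (v2f64 a, v2f64 b) { return cmpge_f64 (b, a); }
```

No extra instruction is needed for either derived compare; only the operand order changes.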
Wait a minute, there is no vec_not() either. We cannot find it in the
PowerISA, the OpenPOWER ABI, or the GCC PowerPC AltiVec Built-in documentation.
There is no vec_move() either! How can this possibly work?
This is RISC philosophy again. We can always use a logical
instruction (like bitwise and or bitwise or) to effect a move, given that we
also have nondestructive 3 register instruction forms. In the PowerISA most
instructions have two input registers and a separate result register. So if
the result register number is different from either input register then the
inputs are not clobbered (nondestructive). Of course nothing prevents you from
specifying the same register for both inputs or even all three registers
(result and both inputs). And sometimes that is useful.
The statement B = vec_or (A,A) is effectively a vector move/copy
from A to B. And A = vec_or (A,A) is obviously a nop (no operation). In fact
the PowerISA defines the preferred nop and register move for vector registers
in this way.
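As a small sketch of the idea, the following uses the GCC/Clang generic vector extension (an assumption made for portability; with altivec.h the operation would be vec_or). The names v4u32 and move_via_or are hypothetical. An optimizing compiler may well simplify the expression, so this shows the logic rather than the generated instruction.

```c
/* Four 32-bit lanes in one 128-bit vector; the type name is illustrative. */
typedef unsigned int v4u32 __attribute__ ((vector_size (16)));

/* OR-ing a value with itself reproduces the value, so with a separate
   result register this behaves as a copy from a to the result. */
static inline v4u32 move_via_or (v4u32 a)
{
  return a | a;
}
```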
It is also useful to have hardware implement the logical operators
nor (not or) and nand (not and). The PowerISA provides these instructions for
fixed point and vector logical operations. So vec_not(A) can be implemented
as vec_nor(A,A).
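The nor-based complement can be checked with a short sketch. It again assumes the GCC/Clang generic vector extension rather than altivec.h, and nor_u32 and not_via_nor are hypothetical helper names, with nor_u32 playing the role of vec_nor.

```c
/* Four 32-bit lanes in one 128-bit vector; the type name is illustrative. */
typedef unsigned int v4u32 __attribute__ ((vector_size (16)));

/* nor (a, b) = ~(a | b), the operation that vec_nor provides. */
static inline v4u32 nor_u32 (v4u32 a, v4u32 b)
{
  return ~(a | b);
}

/* not (a) = nor (a, a), because a | a is just a. */
static inline v4u32 not_via_nor (v4u32 a)
{
  return nor_u32 (a, a);
}
```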
So looking at the implementation of _mm_cmpne we propose the
following:
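A minimal sketch of how a not-equal compare for vector double could be composed from an equal compare followed by a nor. It is written with the GCC/Clang generic vector extension so it builds on any target; the names v2f64, v2i64 and cmpneq_f64 are illustrative and are not the real intrinsic names. With altivec.h the two steps would correspond to vec_cmpeq and vec_nor.

```c
/* Generic vector types standing in for "vector double" and its compare
   result; these names are illustrative and do not come from altivec.h. */
typedef double    v2f64 __attribute__ ((vector_size (16)));
typedef long long v2i64 __attribute__ ((vector_size (16)));

/* Hypothetical stand-in for a packed-double "compare not equal". */
static inline v2i64 cmpneq_f64 (v2f64 a, v2f64 b)
{
  /* Step 1, the role of vec_cmpeq: all ones (-1) in lanes where a == b. */
  v2i64 eq = (v2i64) (a == b);
  /* Step 2, the role of vec_nor (eq, eq): complement the mask. */
  return ~(eq | eq);
}
```

A lane holding a NaN compares as not equal to everything, so its result lane is all ones.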
The Intel Intrinsics also include the not forms of the relational
compares, such as _mm_cmpnlt_pd, _mm_cmpnle_pd, _mm_cmpngt_pd, and
_mm_cmpnge_pd.
Neither the PowerISA, the OpenPOWER ABI, nor the GCC PowerPC AltiVec Built-in
documentation provides any direct equivalent to the not greater than class of
compares. Again you don't really need them if you know Boolean algebra. We can
use identities like {not (A < B) iff A >= B} and {not (A <= B) iff A > B},
which hold whenever neither operand is a NaN. So the PPC64LE implementation
follows:
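A sketch of the "not" compares built directly from these identities, with no complement step at all. As before it assumes the GCC/Clang generic vector extension in place of altivec.h, and the names cmpnlt_f64 and cmpnle_f64 are hypothetical. It also assumes ordered operands: when a lane holds a NaN, both A < B and A >= B are false, so the rewritten compare reports false where a literal "not less than" would report true.

```c
/* Generic vector types standing in for "vector double" and its compare
   result; these names are illustrative and do not come from altivec.h. */
typedef double    v2f64 __attribute__ ((vector_size (16)));
typedef long long v2i64 __attribute__ ((vector_size (16)));

/* not (A < B) iff A >= B, for ordered operands: one greater-or-equal. */
static inline v2i64 cmpnlt_f64 (v2f64 a, v2f64 b)
{
  return (v2i64) (a >= b);
}

/* not (A <= B) iff A > B, for ordered operands: one greater-than. */
static inline v2i64 cmpnle_f64 (v2f64 a, v2f64 b)
{
  return (v2i64) (a > b);
}
```

The not greater than forms follow the same way after swapping the operands, since not (A > B) iff B >= A and not (A >= B) iff B > A.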
These patterns repeat for the scalar versions of the not compares. And
in general the larger pattern described in this chapter applies to the other
float and integer types with similar interfaces.