Examples implemented using other intrinsics

Some intrinsic implementations are defined in terms of other intrinsics. For example, _mm_load_sd is implemented by calling _mm_set_sd.

The notion of using only part (one fourth or half) of the SSE XMM register and leaving the rest unchanged (or forced to zero) is specific to SSE scalar operations, and it can generate complicated (sub-optimal) PowerISA code. In this case, _mm_load_sd passes the dereferenced double value to _mm_set_sd, which uses C vector initializer notation to combine (merge) that double scalar value with a scalar 0.0 constant into a vector double.

While code like this should work as-is for PPC64LE, you should look at the generated code and assess whether it is reasonable. In this case the code is not awful (a load double splat, a vector xor to generate 0.0s, then an xxmrghd to combine __F and 0.0). Other examples may generate sub-optimal code and justify a rewrite to PowerISA scalar or vector code (using the GCC PowerPC AltiVec Built-in Functions or inline assembler).

Try using the existing C code if you can, but check what the compiler generates. If the generated code is horrendous, it may be worth the effort to write a PowerISA-specific equivalent. For code that makes extensive use of MMX or SSE scalar intrinsics, you will be better off rewriting it to use standard C scalar types and letting the GCC compiler handle the details.
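
The _mm_load_sd and _mm_set_sd pattern described above can be sketched roughly as follows. This is a minimal illustration using GCC vector extensions, not the actual header source: the names v2df, sketch_set_sd, and sketch_load_sd are hypothetical stand-ins, and the real intrinsics carry additional attributes that are omitted here.

```c
/* Sketch only: v2df, sketch_set_sd and sketch_load_sd are hypothetical
   names standing in for __m128d, _mm_set_sd and _mm_load_sd.  */

/* A 16-byte vector holding two doubles (GCC vector extension).  */
typedef double v2df __attribute__ ((vector_size (16)));

/* Merge the scalar f with a 0.0 constant using vector initializer
   notation: element 0 receives f, element 1 is forced to zero.  */
static inline v2df
sketch_set_sd (double f)
{
  return (v2df) { f, 0.0 };
}

/* Defined in terms of the other function: dereference the pointer
   and pass the scalar value on.  */
static inline v2df
sketch_load_sd (const double *p)
{
  return sketch_set_sd (*p);
}
```

Calling sketch_load_sd (&x) on a double x should yield a vector whose first element is x and whose second element is 0.0. To assess the generated code, one option is to compile a small caller with something like gcc -O2 -S on the target and read the resulting assembly.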