diff --git a/Intrinsics_Reference/ch_biendian.xml b/Intrinsics_Reference/ch_biendian.xml
index 8b7a82a..8c4d76a 100644
--- a/Intrinsics_Reference/ch_biendian.xml
+++ b/Intrinsics_Reference/ch_biendian.xml
@@ -1505,6 +1505,245 @@ a[3] = c;
       and the second integer element of b, in that order.
+
+      The big endian result is {0x00010203, 0x1c1d1e1f, 0x0c0d0e0f,
+      0x14151617}, as shown here:
+
+      a   00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
+      b   10 11 12 13 14 15 16 17 18 19 1A 1B 1C 1D 1E 1F
+      c    0  1  2  3 28 29 30 31 12 13 14 15 20 21 22 23
+      t   00 01 02 03 1C 1D 1E 1F 0C 0D 0E 0F 14 15 16 17
+
       For little endian, the modified PCV is elementwise subtracted
       from 31, giving {31,30,29,28,3,2,1,0,19,18,17,16,11,10,9,8}.
@@ -1515,10 +1754,247 @@ a[3] = c;
       vperm instruction will again select entire elements using the
       groups of 4 contiguous bytes, and the values of the integers
       will be reordered without compromising
-      each integer's contents. The fact that the little-endian
-      result matches the big-endian result is left as an exercise
-      for the reader.
+      each integer's contents. The little-endian result matches the
+      big-endian result, as shown. Observe that a and b switch positions for little endian
+      code generation.
+
+      b   1C 1D 1E 1F 18 19 1A 1B 14 15 16 17 10 11 12 13
+      a   0C 0D 0E 0F 08 09 0A 0B 04 05 06 07 00 01 02 03
+      c    8  9 10 11 16 17 18 19  0  1  2  3 28 29 30 31
+      t   14 15 16 17 0C 0D 0E 0F 1C 1D 1E 1F 00 01 02 03
+
       Now, suppose instead that the original PCV does not reorder
       entire integers at once:
@@ -1528,6 +2004,241 @@ a[3] = c;
       The result of the big-endian implementation would be:
       t = {0x00141f04, 0x07110613, 0x1e030208, 0x090d0516};
+
+      a   00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
+      b   10 11 12 13 14 15 16 17 18 19 1A 1B 1C 1D 1E 1F
+      c    0 20 31  4  7 17  6 19 30  3  2  8  9 13  5 22
+      t   00 14 1F 04 07 11 06 13 1E 03 02 08 09 0D 05 16
+
       For little-endian, the modified PCV would be
       {31,11,0,27,24,14,25,12,1,28,29,23,22,18,26,9}, appearing in
@@ -1539,6 +2250,241 @@ a[3] = c;
       which bears no resemblance to the big-endian result.
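As a rough cross-check of the byte tables in this patch, the following portable C sketch models vperm on plain byte arrays. It does not use altivec.h and is not part of the patch. The names vreg, vperm, load_ints, store_ints, vec_perm_model, and check_examples are hypothetical helpers. The little-endian branch assumes only the transformation described in the text: subtract each PCV element from 31 and swap a and b. The input values for a and b are read off the table rows.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* A 128-bit register modeled as 16 bytes in big-endian register order
   (byte 0 is the leftmost byte as registers are drawn in the tables). */
typedef struct { uint8_t r[16]; } vreg;

/* Model of vperm: each control byte selects one of the 32 bytes of x || y. */
vreg vperm(vreg x, vreg y, vreg c)
{
    uint8_t cat[32];
    vreg t;
    memcpy(cat, x.r, 16);
    memcpy(cat + 16, y.r, 16);
    for (int i = 0; i < 16; i++)
        t.r[i] = cat[c.r[i] & 31];
    return t;
}

/* Place four 32-bit elements in a register. Element 0 is leftmost for
   big endian and rightmost for little endian; bytes within an element
   always run from most to least significant, left to right. */
vreg load_ints(const uint32_t e[4], int little_endian)
{
    vreg v;
    for (int k = 0; k < 4; k++) {
        int base = 4 * (little_endian ? 3 - k : k);
        for (int j = 0; j < 4; j++)
            v.r[base + j] = (uint8_t)(e[k] >> (24 - 8 * j));
    }
    return v;
}

/* Inverse of load_ints. */
void store_ints(vreg v, uint32_t e[4], int little_endian)
{
    for (int k = 0; k < 4; k++) {
        int base = 4 * (little_endian ? 3 - k : k);
        e[k] = 0;
        for (int j = 0; j < 4; j++)
            e[k] = (e[k] << 8) | v.r[base + j];
    }
}

/* Hypothetical model of vec_perm(a, b, pcv) for either endianness. */
void vec_perm_model(const uint32_t a[4], const uint32_t b[4],
                    const uint8_t pcv[16], int little_endian, uint32_t t[4])
{
    vreg va = load_ints(a, little_endian);
    vreg vb = load_ints(b, little_endian);
    vreg c;
    if (!little_endian) {
        memcpy(c.r, pcv, 16);
        store_ints(vperm(va, vb, c), t, 0);
    } else {
        /* Assumed little-endian sequence: subtract each PCV element from
           31 (byte element i sits at register byte 15 - i), then swap
           the two inputs. */
        for (int i = 0; i < 16; i++)
            c.r[15 - i] = (uint8_t)(31 - pcv[i]);
        store_ints(vperm(vb, va, c), t, 1);
    }
}

/* Replays both PCVs from the text under both endianness models. */
void check_examples(void)
{
    const uint32_t a[4] = {0x00010203, 0x04050607, 0x08090a0b, 0x0c0d0e0f};
    const uint32_t b[4] = {0x10111213, 0x14151617, 0x18191a1b, 0x1c1d1e1f};
    const uint8_t whole[16] = {0,1,2,3,28,29,30,31,12,13,14,15,20,21,22,23};
    const uint8_t mixed[16] = {0,20,31,4,7,17,6,19,30,3,2,8,9,13,5,22};
    uint32_t be[4], le[4];

    vec_perm_model(a, b, whole, 0, be);
    vec_perm_model(a, b, whole, 1, le);
    assert(memcmp(be, le, sizeof be) == 0); /* whole elements: results agree */

    vec_perm_model(a, b, mixed, 0, be);
    vec_perm_model(a, b, mixed, 1, le);
    assert(memcmp(be, le, sizeof be) != 0); /* scrambled bytes: results differ */
}
```

Calling check_examples() from a small main exercises both cases. In this model, the element-aligned PCV yields identical integers for big and little endian, while the byte-scrambling PCV does not.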
+
+      b   1C 1D 1E 1F 18 19 1A 1B 14 15 16 17 10 11 12 13
+      a   0C 0D 0E 0F 08 09 0A 0B 04 05 06 07 00 01 02 03
+      c    9 26 18 22 23 29 28  1 12 25 14 24 27  0 11 31
+      t   15 06 0E 0A 0B 01 00 1D 10 05 12 04 07 1C 17 03
+
       The lesson here is to only use vec_perm to reorder entire
       elements of a vector. If you must use vec_perm