Commit Graph

2 Commits (8f537c13bc1fe4b3a24c941d469316d61483cbcb)

Author SHA1 Message Date
Paul Mackerras 23ff954059 core: Change bperm to a simpler and slower implementation
This does bperm in the bitsort unit instead of the logical unit, and
no longer tries to do it in a single cycle with eight 64-to-1
multiplexers.  Instead it is now a state machine in the bitsort unit
that takes 8 cycles and uses only one 64-to-1 multiplexer.  This helps
improve timing and reduces LUT usage.
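
As a rough illustration of the one-bit-per-cycle approach described above, here is a minimal Python reference model of bpermd. It is a sketch, not the VHDL from this commit: the function name `bpermd_model` is hypothetical, each loop iteration stands in for one cycle of the state machine, and the bit numbering follows the usual Power ISA big-endian convention (bit 0 is the most significant bit) as an assumption.

```python
MASK64 = (1 << 64) - 1

def bpermd_model(rs: int, rb: int) -> int:
    """Behavioural model of bpermd: one result bit per iteration ("cycle")."""
    rs &= MASK64
    rb &= MASK64
    result = 0
    for i in range(8):
        # Byte i of RS (byte 0 is the most significant byte) is the bit index.
        index = (rs >> (56 - 8 * i)) & 0xFF
        # A single 64-to-1 selection per step; indices >= 64 yield 0.
        bit = (rb >> (63 - index)) & 1 if index < 64 else 0
        # Shift the selected bit into the low byte of the result.
        result = (result << 1) | bit
    return result

if __name__ == "__main__":
    # Select big-endian bits 0..7 of RB, i.e. its top byte.
    print(hex(bpermd_model(0x0001020304050607, 0xA5 << 56)))
```

The loop makes the trade-off visible: eight sequential selections reuse one multiplexer, where a single-cycle version would need eight of them in parallel.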

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2 weeks ago
Paul Mackerras fa9df33f7e Implement cfuged, pdepd and pextd
This implements the cfuged, pdepd and pextd instructions in a new unit
called bit_sorter (so called because cfuged and pextd can be viewed as
sorting the bits of the mask).
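
One way to read the "sorting" remark is that cfuged behaves like a stable sort of the source bits keyed on the corresponding mask bit. The sketch below models that view in Python. The function name `cfuged_by_sorting` is hypothetical, and the semantics (bits under a 0 mask bit gather at the left, bits under a 1 mask bit gather at the right, order preserved within each group) are stated here as an assumption about the architected behaviour rather than taken from this commit's VHDL.

```python
def cfuged_by_sorting(rs: int, rb: int) -> int:
    """Model cfuged as a stable sort of (mask bit, data bit) pairs."""
    # List the pairs from the most significant bit down to the least.
    pairs = [((rb >> i) & 1, (rs >> i) & 1) for i in range(63, -1, -1)]
    # Python's sort is stable: mask-0 bits move left, mask-1 bits move right,
    # and the original order is kept inside each group.
    pairs.sort(key=lambda pair: pair[0])
    result = 0
    for _, data_bit in pairs:
        result = (result << 1) | data_bit
    return result

if __name__ == "__main__":
    print(hex(cfuged_by_sorting(0x0F, 0xF0)))
```

Under the same assumption, the low popcount(mask) bits of this result are what pextd would return, which is why both instructions fit the sorting picture.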

The cnt* instructions and the popcnt* instructions now use the same
OP_COUNTB insn_type so as to free up an insn_type value to use for the
new instructions.

The new instructions are implemented using a slow and simple algorithm
that takes 64 cycles to compute the result.  The ex1 stage is stalled
while this happens, as for a 64-bit multiply, or for a divide when
there is no FPU.
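
To make the 64-step cost concrete, the following Python sketch shows one plausible bit-serial formulation of pextd and pdepd that examines a single mask bit per step. It is an assumption-laden model, not the algorithm from the VHDL in this commit: the function names are hypothetical, and each loop iteration merely stands in for one cycle.

```python
def pextd_model(rs: int, rb: int) -> int:
    """Gather the bits of rs selected by mask rb into the low-order bits."""
    result = 0
    out_pos = 0                      # next free position in the result
    for i in range(64):              # one mask bit per step
        if (rb >> i) & 1:
            result |= ((rs >> i) & 1) << out_pos
            out_pos += 1
    return result

def pdepd_model(rs: int, rb: int) -> int:
    """Scatter the low-order bits of rs to the positions set in mask rb."""
    result = 0
    src_pos = 0                      # next source bit to deposit
    for i in range(64):              # one mask bit per step
        if (rb >> i) & 1:
            result |= ((rs >> src_pos) & 1) << i
            src_pos += 1
    return result

if __name__ == "__main__":
    mask = 0b11001100
    packed = pextd_model(0b10110110, mask)
    print(bin(packed), bin(pdepd_model(packed, mask)))
```

In this model pdepd undoes pextd for the bits covered by the mask, which gives a convenient round-trip check when comparing against a simulation.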

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
1 month ago