forked from cores/microwatt

Browse Source
###

This implements an alternative count-leading-zeroes algorithm which uses less LUTs to generate the higher-order bits (2..5) of the result. By doing (v | -v) rather than (v & -v), we get a value which has ones from the MSB down to the rightmost 1 bit in v and then zeroes down to the LSB. This means that we can generate the MSB of the result (the index of the rightmost 1 bit in v) just by looking at bits 63 and 31 of (v | -v), assuming that v is 64 bits. Bit 4 of the result requires looking at bits 63, 47, 31 and 15. In contrast, each bit of the result using (v & -v), which has a single 1, requires ORing together 32 bits. It turns out that the minimum LUT usage comes from using (v & -v) to generate bits 0 and 1 of the result, and using (v | -v) to generate bits 2 to 5. This saves almost 60 6-input LUTs on the Artix-7. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>fpu-constant

**Paul Mackerras**1 year ago

**1 changed files**with

**32 additions**and

**7 deletions**

Loading…

Reference in new issue

**Repository**

matthew1kalasky3/microwatt

**Title**

**Body**