Commit Graph

6 Commits (ff63ffdbfd969c7eaeb54e5369b655dea080256c)

Author SHA1 Message Date
Paul Mackerras 21ab36a0c0 Pre-decode instructions when writing them to icache
This splits out the decoding done in the decode0 step into a separate
predecoder, used when writing instructions into the icache.  The
icache now holds 36 bits per instruction rather than 32.  For valid
instructions, those 36 bits comprise the bottom 26 bits of the
instruction word, a 9-bit insn_code value (which uniquely identifies
the instruction), and a zero in the MSB.  For illegal instructions,
the MSB is one and the full instruction word is in the bottom 32 bits.
Having the full instruction word available for illegal instructions
means that it can be printed in the log when simulating, or in future
could be placed in the HEIR register.

If we don't have an FPU, then the floating-point instructions are
regarded as illegal.  In that case, the insn_code values would fit
into 8 bits, which could be used in future to reduce the size of
decode_rom from 512 to 256 entries.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2 years ago
Anton Blanchard 74254bf11a Reformat cache_ram
Signed-off-by: Anton Blanchard <anton@linux.ibm.com>
4 years ago
Benjamin Herrenschmidt ecaa5e2fb2 dcache: Rework RAM wrapper to synthetize better on Xilinx
The global wr_en signal is causing Vivado to generate two TDP (True Dual Port)
block RAMs instead of one SDP (Simple Dual Port) for each cache way. Remove
it and instead apply a AND to the individual byte write enables.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
5 years ago
Benjamin Herrenschmidt 174378b190 dcache: Introduce an extra cycle latency to make timing
This makes the BRAMs use an output buffer, introducing an extra
cycle latency. Without this, Vivado won't make timing at 100Mhz.

We stash all the necessary response data in delayed latches, the
extra cycle is NOT a state in the state machine, thus it's fully
pipelined and doesn't involve stalling.

This introduces an extra non-pipelined cycle for loads with update
to avoid collision on the writeback output between the now delayed
load data and the register update. We could avoid it by moving
the register update in the pipeline bubble created by the extra
update state, but it's a bit trickier, so I leave that for a latter
optimization.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
5 years ago
Benjamin Herrenschmidt a38ae503ff cache_ram: Add write-enables
They will be needed by the dcache

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
5 years ago
Benjamin Herrenschmidt b56b46b7d1 icache: Set associative icache
This adds support for set associativity to the icache. It can still
be direct mapped by setting NUM_WAYS to 1.

The replacement policy uses a simple tree-PLRU for each set.

This is only lightly tested, tests pass but I have to double check
that we are using the ways effectively and not creating duplicates.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
5 years ago