Commit Graph

17 Commits (cff4b13a9b8d0036f9441c150bfd8e1d7bc620e3)

Author SHA1 Message Date
Benjamin Herrenschmidt 48f260761b writeback: Slightly improve timing
The CR update currently depends on the complete data formatting
mux chain. This makes it source its inputs from a bit earlier in
the chian, thus improving timing a bit

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
5 years ago
Benjamin Herrenschmidt b513f0fb48 dcache: Add a dcache
This replaces loadstore2 with a dcache

The dcache unit is losely based on the icache one (same basic cache
layout), but has some significant logic additions to deal with stores,
loads with update, non-cachable accesses and other differences due to
operating in the execution part of the pipeline rather than the fetch
part.

The cache is store-through, though a hit with an existing line will
update the line rather than invalidate it.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
5 years ago
Benjamin Herrenschmidt 858b1e7930 writeback: Remove a mux leg on data_in
Initialize to 0 forces the mux to have an extra leg fed with zeros.

Instead initialize data_in to one of the mux inputs

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
5 years ago
Paul Mackerras 57b200d6cb writeback: Eliminate inferred latch
This initializes data_in to all zeroes so that it doesn't become a
set of 64 inferred latches.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
5 years ago
Paul Mackerras f49a5a99a5 Remove execute2 stage
Since the condition setting got moved to writeback, execute2 does
nothing aside from wasting a cycle.  This removes it.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
5 years ago
Paul Mackerras 9646fe28b0 Do sign-extension instructions in writeback instead of execute1
This makes the exts[bhw] instructions do the sign extension in the
writeback stage using the sign-extension logic there instead of
having unique sign extension logic in execute1.  This requires
passing the data length and sign extend flag from decode2 down
through execute1 and execute2 and into writeback.  As a side bonus
we reduce the number of values in insn_type_t by two.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
5 years ago
Paul Mackerras 374f4c536d writeback: Do data formatting and condition recording in writeback
This adds code to writeback to format data and test the result
against zero for the purpose of setting CR0.  The data formatter
is able to shift and mask by bytes and do byte reversal and sign
extension.  It can also put together bytes from two input
doublewords to support unaligned loads (including unaligned
byte-reversed loads).

The data formatter starts with an 8:1 multiplexer that is able
to direct any byte of the input to any byte of the output.  This
lets us rotate the data and simultaneously byte-reverse it.
The rotated/reversed data goes to a register for the unaligned
cases that overlap two doublewords.  Then there is per-byte logic
that does trimming, sign extension, and splicing together bytes
from a previous input doubleword (stored in data_latched) and the
current doubleword.  Finally the 64-bit result is tested to set
CR0 if rc = 1.

This removes the RC logic from the execute2, multiply and divide
units, and the shift/mask/byte-reverse/sign-extend logic from
loadstore2.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
5 years ago
Paul Mackerras d5bc6c8824 Add a divider unit and a testbench for it
This adds a divider unit, connected to the core in much the same way
that the multiplier unit is connected.  The division algorithm is
very simple-minded, taking 64 clock cycles for any division (even
32-bit division instructions).

The decoding is simplified by making use of regularities in the
instruction encoding for div* and mod* instructions.  Instead of
having PPC_* encodings from the first-stage decoder for each of the
different div* and mod* instructions, we now just have PPC_DIV and
PPC_MOD, and the inputs to the divider that indicate what sort of
division operation to do are derived from instruction word bits.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
5 years ago
Anton Blanchard 152261fac8 Remove cycle in writeback
The pipeline had a cycle in writeback. Writeback is pretty
simple and unlikely to be a bottleneck, so lets remove it.

Signed-off-by: Anton Blanchard <anton@linux.ibm.com>
5 years ago
Anton Blanchard e69e79d8af Reformat writeback.vhdl
Signed-off-by: Anton Blanchard <anton@linux.ibm.com>
5 years ago
Anton Blanchard 50a361a5dc Exit if we try to write more than one GPR or CR in a cycle
Signed-off-by: Anton Blanchard <anton@linux.ibm.com>
5 years ago
Anton Blanchard 9fe8d211eb Register outputs on writeback
Signed-off-by: Anton Blanchard <anton@linux.ibm.com>
5 years ago
Anton Blanchard fa04936c92 Add some assertions to writeback
We want to make sure we never complete more than one
instruction per cycle, or write back more than one GPR
or CR per cycle.

Signed-off-by: Anton Blanchard <anton@linux.ibm.com>
5 years ago
Anton Blanchard fb4cad6eaf Remove second write port
We only need two write ports for load with update instructions.
Having two write ports just for this instruction is expensive.

For now we will force them to be the only instruction in the
pipeline, and take two cycles of writeback.

Signed-off-by: Anton Blanchard <anton@linux.ibm.com>
5 years ago
Anton Blanchard 9fbaea6f08 Rework CR file and add forwarding
Handle the CR as a single field with per nibble enables. Forward any
writes in the same cycle.

If this proves to be an issue for timing, we may want to revisit
this in the future. For now, it keeps things simple.

Signed-off-by: Anton Blanchard <anton@linux.ibm.com>
5 years ago
Anton Blanchard 147b259691 Use a better input signal in writeback
w_in comes from the execution unit, it makes more sense to call
it e_in.

Signed-off-by: Anton Blanchard <anton@linux.ibm.com>
5 years ago
Anton Blanchard 5a29cb4699 Initial import of microwatt
Signed-off-by: Anton Blanchard <anton@linux.ibm.com>
5 years ago