This changes the decoding of major opcode 19 from using the ppc_insn_t
index to using bits of the instruction word directly. Opcode 19 has
a 10-bit minor opcode field (bits 10..1) but the space is sparsely
filled. Therefore we index a table of single-bit entries with the
10-bit minor opcode to filter out the illegal minor opcodes, and
index a table using just 3 bits -- 5, 3 and 2 -- of the instruction
to get the decode entry. This groups together all the instructions
in 4 columns of the opcode map as a single entry. That means that
mcrf and all the CR logical ops get grouped together, and bcctr, bclr
and bctar get grouped together. At present the CR logical ops are not
implemented, so their grouping has no impact.
The code for bclr and bcctr in execute1 is now common, using a single
op, and it now determines the branch address by looking at bit 10 of
the instruction word at execute time.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
With this, we have a table for most major opcodes and separate
tables for each major opcode that has further decoding required.
These tables are still mostly indexed by the ppc_insn_t values,
however.
A few things are still decoded completely at the top level: nop,
attn and sim_config.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Instead of doing mfctr, mflr, mftb, mtctr, mtlr as separate ops,
just pass down mfspr and mtspr ops with the spr number and let
execute1 decode which SPR we're addressing. This will help reduce
the number of instruction bits decode1 needs to look at.
In fact we now pass down the whole instruction from decode2 to
execute1. We will need more bits of the instruction in future,
and the tools should just optimize away any that we don't end
up using. Since the 'aa' bit was just a copy of an instruction
bit, we can now remove it from the record.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Hopefully it's not too timing catastrophic. The variable newcrf will
be handy for the other CR ops when we implement them I suspect.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This adds a divider unit, connected to the core in much the same way
that the multiplier unit is connected. The division algorithm is
very simple-minded, taking 64 clock cycles for any division (even
32-bit division instructions).
The decoding is simplified by making use of regularities in the
instruction encoding for div* and mod* instructions. Instead of
having PPC_* encodings from the first-stage decoder for each of the
different div* and mod* instructions, we now just have PPC_DIV and
PPC_MOD, and the inputs to the divider that indicate what sort of
division operation to do are derived from instruction word bits.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
This module adds some simple core controls:
reset, stop, start, step
along with icache clear and reading the NIA and core
status bits
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org
We need to finish support for all the trap instructions, but for now
we at least need a decode entry for tw, so we know to stall until the
previous instruction completes. Some of our test cases were failing
because the trap executed before the previous instruction completed.
All these trap instructions need to be resolved at completion, not
in execute.
Signed-off-by: Anton Blanchard <anton@linux.ibm.com>
We can force all existing code to use the UART console
by passing 0 in bit zero of the sim config register.
Signed-off-by: Anton Blanchard <anton@linux.ibm.com>
Handle the CR as a single field with per nibble enables. Forward any
writes in the same cycle.
If this proves to be an issue for timing, we may want to revisit
this in the future. For now, it keeps things simple.
Signed-off-by: Anton Blanchard <anton@linux.ibm.com>