The global wr_en signal is causing Vivado to generate two TDP (True Dual Port)
block RAMs instead of one SDP (Simple Dual Port) for each cache way. Remove
it and instead apply a AND to the individual byte write enables.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The two tests obtain NIA with bl+mflr+addi and then compare it
against addpcis with the minimum and maximum immediate operand values.
They were also tested on a real POWER9 system (in userspace) for good
measure.
Signed-off-by: Shawn Anastasio <shawn@anastas.io>
This commit adds support for the addpcis instruction from ISA 3.0.
A new input_reg_b_t type, CONST_DX_HI, was added to support the
shifted immediate value used in DX-Form instructions.
Signed-off-by: Shawn Anastasio <shawn@anastas.io>
Use a simple wire. common.vhdl types are better kept for things
local to the core. We can add more wires later if we need to for
HV irqs etc...
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This changes the SoC interconnect such that the main 64-bit wishbone out
of the processor is first split between only 3 slaves (BRAM, DRAM and a
general "IO" bus) instead of all the slaves in the SoC.
The IO bus leg is then latched and down-converted to 32 bits data width,
before going through a second address decoder for the various IO devices.
This significantly reduces routing and timing pressure on the main bus,
allowing to get rid of frequent timing violations when synthetizing on
small'ish FPGAs such as the Artix-7 35T found on the original Arty board.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We still need to a way to our FPGA target on the command line, but this
at least gets us down to a common Makefile.
Signed-off-by: Anton Blanchard <anton@linux.ibm.com>
Instead of building each file one by one (and having to track all
the dependencies manually), use the ghdl -c command that does
analysis and elaboration in one go.
Signed-off-by: Anton Blanchard <anton@linux.ibm.com>
Add a microwatt-verilator target that simulates the
ghdl -> yosys -> verilog -> verilator path. A good test of
ghdl/yosys synthesis.
Because the everything is run through synthesis, the instruction
image is baked into the build via the RAM_INIT_FILE generic.
Signed-off-by: Anton Blanchard <anton@linux.ibm.com>
This describes how to build the tool on Fedora, and on Debian which lacks a packaged
liburjtag as of mid 2020.
Signed-off-by: Joel Stanley <joel@jms.id.au>
CFLAGS was defined but not used anywhere. This adds them to the compile
line, and fixes the warnings (and errors!) that result.
Signed-off-by: Joel Stanley <joel@jms.id.au>
- Changing use of others in core files to satisfy VCS
- Adding workaround for VCS subtype constraint inconsistencies in common.vhdl
Signed-off-by: Jonathan Balkind <jbalkind@princeton.edu>
This adds one-cycle latches to the various resets out of the soc and
into the various core modules. It *seems* to help vivado P&R a bit
and has shown to avoid timing violations under some circumstances.
Interestingly those resets never seem to appear in the bad timing
path. It looks like those long resets simply impose placement
constraints that Vivado satisfies at the expense of timing elsewhere.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
When using litedram, request a much longer PLL reset. This seems to
help get rid of all the grabled output after config.
Also use the clean system_rst out of litedram as our source of reset
for the rest of the SoC (it is synchronized with system_clk and takes
pll_locked into account already)
In some cases we need to keep the reset held for much longer,
so use counters rather than shift registers.
Additionally, some signals such as ext_rst and pll_locked
or signals going from the ext_clk domain to the pll_clk
domain need to be treated as async, and testing them without
synchronizers is asking for trouble.
Finally, make the external reset also reset the PLL.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Things have changed a bit in upstream LiteX. LiteDRAM now exposes a
wishbone for the CSRs for example.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This fixes a bug where a store that hits in the dcache immediately
following a dcbz has its write to the cache RAM suppressed (but not
its write to memory). If a load to the same location comes along
before the cache line gets replaced, the load will return incorrect
data.
Fixes: 4db1676ef8 ("dcache: Don't assert on dcbz cache hit")
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
The rx synchronizers were ... non existent. Someone forgot to add
a if rising_edge(clk) to the process.
For tx, ensure that we have a default value so that TX stays high
from TPGA configuration to the reset being sampled on the first clock
cycle.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The pp_fifo decides whether top = bottom means empty or full based
on whether the previous operation was a push or a pop.
If the fifo performs both in one cycle, it sets the previous op to
pop. That means that a full fifo being added a character and removed
one at the same time becomes empty.
Instead, just leave the previous op alone. If the fifo was empty, it
remains so, if it was full ditto.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>